Here's the full question with a code editor if you want to try it out!: www.interviewquery.com/questions/automatic-histogram
@Snorrason2 жыл бұрын
Nicely done! Just to note that none of the tests cover cases where a bin would have zero values. At a quick glance, it could be remedied by initializing a `count` variable to zero (instead of`histogram[bucket_str] = 0`) before the last for-loop. Once you break out of the loop, you can check if the count is non-zero before adding it to the histogram.
@yemanbrhane39003 ай бұрын
The number of bins should be (max -min)/x
@tokpot2 жыл бұрын
The solution is wrong... and it's quite easy to test how : dataset = [7,8,9] x = 2 returned : {7-9 : 3} here's a pretty much 'coverall' solution using no libraries : def automatic_histogram(dataset, x): histogram = {} if len(dataset)==0: return histogram if len(set(dataset))>x: return histogram min_number = dataset[0] max_number = dataset[0] for i in dataset: if i < min_number: min_number = i if i > max_number: max_number = i if min_number==max_number: return {str(min_number):len(dataset)} width = (max_number-min_number)//x edges = [min_number, *[min_number + (i+1)*(width+1) for i in range(x-1)]] bins = [str(edges[i])+'-'+str(edges[i+1]-1) if edges[i]!=edges[i+1]-1 else str(edges[i]) for i in range(x-1)] if edges[-1]==max_number: bins.append(str(max_number)) else: bins.append(str(edges[-1])+'-'+str(max_number)) for b in bins: histogram[b] = 0 for i in range(x-1): for d in dataset: if edges[i]
@DanielMichaelFrees4 ай бұрын
I believe this is forgetting to drop bins with count 0. I also wonder if an interviewer might expect the function to adjust bins for count 0 so that we always achieve number of bins = x if possible (this would be an uglier question if so).
@piyushagarwal83302 жыл бұрын
Started with not using sort to decrease complexity from Onlogn then added a for loop in while loop for On². Btw very nice solution, wasn't able to solve myself.
@Snorrason2 жыл бұрын
Fortunately, the while loop goes closer to O(x * bucket_width) in time complexity, an improvement on O(n²).
@danielaldana69572 жыл бұрын
What level of difficulty is this one?
@nedaebrahimi67352 жыл бұрын
instead of using loops for calculating min and max we could easily use the min(dataset) and max(dataset) and use Counter to count the number of elements in the dataset. here is my solution: def histogram(data, x): res={} min_value=min(data) max_value= max(data) bins= math.ceil((max_value)/x) counts= Counter(data) if max_value==min_value: return {str(max_value):counts[max_value]} i=min_value while i
@abyszero86202 жыл бұрын
My kingdom for a space bar
@jp_magal2 жыл бұрын
I liked both question and solution! But I don't think it would pass test case with a dataset containing only negative values, and the question doesn't state there can't be negatives, does it?
@d-rex70432 жыл бұрын
A histogram is usually analogous to a probability distribution and you can't have a negative probability
@DilanChecker Жыл бұрын
Not true. Example: a voltage measurement. It is normal distributed since you have statistical errors in the measurement, but it can have negative values or even a mix if you measure close to V=0
@AhmedMahmoud-wz6du2 жыл бұрын
Nice Question ! and smart answer 😄
@DanielMichaelFrees4 ай бұрын
My solution (using a map for easy value counting, without importing Counter). I believe it is fairly robust, though the question has some ambiguity. In particular around how to handle inexact division (though we can infer from the example that we want to drop extra range from the final bucket most likely. Additionally, the following solution will allow for the 'wrong' number of buckets in cases where we have buckets with value 0. That behavior is unclear from the directions. def automatic_histogram(dataset: List[int], x: int) -> dict: """ @param dataset(List[int]) the numbers to populate the histogram @param x(int): the number of bins for the histogram @return dict: dictionary containing keys that are string ranges spread uniformly over the dataset (except possibly a smaller final range in the case of odd # of input values (as a set)) """ # dataset = sorted(dataset) # make it quick to extract min, max, and iterate # # using O(nlogn) complexity once if x < 1: raise ValueError("Must have at least 1 bin") if not dataset: return {} # O(2n) min_val = min(dataset) max_val = max(dataset) val_range = max_val + 1 - min_val # e.g. (5+1-1) = 5 bin_size = ceil(val_range / x) print(bin_size) bin_range = x * bin_size extra_vals = bin_range - val_range # e.g. 5-3 = 2 histogram = {} value_map = {} #maps integers to their corresponding key in the histogram upper_val = None lower_val = min_val while upper_val != max_val: upper_val = lower_val + bin_size - 1 if upper_val > max_val: # bin range may be larger than value range upper_val = max_val if lower_val == upper_val: key = f"{lower_val}" value_map[lower_val] = key histogram[key] = 0 # single int value range else: key = f"{lower_val}-{upper_val}" for val in range(lower_val, upper_val+1, 1): value_map[val] = key histogram[key] = 0 lower_val += bin_size for val in dataset: key = value_map[val] histogram[key] += 1 # add count for this val to corresponding bin # removal of zero values from histogram final_histogram = {} for key, val in histogram.items(): if val == 0: continue final_histogram[key] = val return final_histogram
@teslarocks3 ай бұрын
The solution provided is not fully correct. I understand the stress under timed interview. Here's my solution given ample time. Thank you, both the interviewer and interviewee for the mock interview. It's quite helpful to me. from collections import defaultdict import math def automatic_histogram(dataset, x): min_num = dataset[0] max_num = dataset[0] freq_count = defaultdict(int) for d in dataset: if d > max_num: max_num = d if d < min_num: min_num = d freq_count[d] += 1 bin_size = math.ceil((max_num - min_num) / x) histogram = defaultdict(int) for n, f in freq_count.items(): bin = (n-min_num) // bin_size if n == max_num and (max_num-min_num)%bin_size == 0: bin_str = str(max_num) else: bin_str = str(min_num + bin*bin_size) + '-' + str(min_num + bin*bin_size + bin_size-1) histogram[bin_str] += f return dict(histogram) ### Tests: x = 4 dataset = [-4,-2,0,1,2,2,5] res = automatic_histogram(dataset, x) print(res) # Answer: {'-4--2': 2, '-1-1': 2, '2-4': 2, '5': 1} x = 5 dataset = [1,2,3,4,5,6,7,8,9] res = automatic_histogram(dataset, x) print(res) # Answer: {'1-2': 2, '3-4': 2, '5-6': 2, '7-8': 2, '9': 1} x = 3 dataset = [1,2,4,4,5,6,6,8,9] res = automatic_histogram(dataset, x) print(res) # Answer: {'1-3': 2, '4-6': 5, '7-9': 2}
@amanmiglani60432 жыл бұрын
Sir where we can get such type of questions? I hardly fined it on internet And now days many companies started asking in there 1st round of coding such type of questions
@iqjayfeng2 жыл бұрын
Check out Interview Query!
@saurabhshrivastava51982 жыл бұрын
Salary comparison between Machine Learning Enginner or Software Engineer
@TheElementFive2 жыл бұрын
You want to work for Google but you can’t use it to find this information?