module Nuggets::Array::HistogramMixin
Constants
- FORMATS
Provides some default formats for formatted_histogram. Example:
(default)          ab [==] 2
(percent)          xyz [===] 3 (37.50%)
(numeric)          42 [==] 2
(numeric_percent)  123 [=] 1 (12.50%)
The “numeric” variants format the item as a (decimal) number.
- HistogramItem
Encapsulates a histogram item and provides the following attributes (see also annotated_histogram):
  - item: The original item
  - freq: The item's frequency in the collection
  - percentage: The percentage of the item's frequency in the collection
  - max_freq: The maximum frequency in the collection
  - max_freq_length: The maximum frequency's “width”
  - max_item_length: The maximum item length in the collection
Public Instance Methods
annotated_histogram()
Calculates the histogram for array and yields each histogram item (see HistogramItem) to the block, or returns an Array of the histogram items if no block is given.
# File lib/nuggets/array/histogram_mixin.rb
def annotated_histogram
  hist, items = histogram, []

  percentage = size / 100.0

  max_freq        = hist.values.max
  max_freq_length = max_freq.to_s.length

  max_item_length = hist.keys.map { |item| item.to_s.length }.max

  # try to sort the histogram hash
  begin
    hist = hist.sort
  rescue ::ArgumentError
  end

  hist.each { |item, freq|
    hist_item = HistogramItem.new(
      item, freq, max_freq, max_freq_length, max_item_length, freq / percentage
    )

    block_given? ? yield(hist_item) : items << hist_item
  }

  block_given? ? hist : items
end
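The following is a minimal sketch of how the items might be consumed. It does not load the nuggets gem: `AnnotatedSketch` is a hypothetical stand-in module whose method bodies are copied from the listings on this page, and the `Struct` definition of `HistogramItem` is an assumption based on the attribute list and constructor call above.

```ruby
# Hypothetical stand-in for the mixin (assumption: not the gem itself).
module AnnotatedSketch
  # Assumed shape; field order follows the HistogramItem.new call above.
  HistogramItem = Struct.new(
    :item, :freq, :max_freq, :max_freq_length, :max_item_length, :percentage
  )

  def histogram
    hist = ::Hash.new(0)
    each { |x| hist[block_given? ? yield(x) : x] += 1 }
    hist
  end

  def annotated_histogram
    hist, items = histogram, []
    percentage = size / 100.0

    max_freq        = hist.values.max
    max_freq_length = max_freq.to_s.length
    max_item_length = hist.keys.map { |item| item.to_s.length }.max

    # Sorting fails for keys that cannot be compared; keep insertion order then.
    begin
      hist = hist.sort
    rescue ::ArgumentError
    end

    hist.each { |item, freq|
      hist_item = HistogramItem.new(
        item, freq, max_freq, max_freq_length, max_item_length, freq / percentage
      )
      block_given? ? yield(hist_item) : items << hist_item
    }

    block_given? ? hist : items
  end
end

data  = %w[b a b c b a].extend(AnnotatedSketch)
items = data.annotated_histogram # no block: returns an Array of items

items.each do |i|
  puts format('%s: %d of %d (%.2f%%)', i.item, i.freq, data.size, i.percentage)
end
# prints:
#   a: 2 of 6 (33.33%)
#   b: 3 of 6 (50.00%)
#   c: 1 of 6 (16.67%)
```

Note that the items come back sorted by item because the keys here are mutually comparable; with mixed key types the sort is skipped and the original insertion order is kept.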
formatted_histogram(format = :default, indicator = '=')
Returns the histogram of array as a formatted String according to format, using indicator to draw the frequency bar.
format may be a Symbol indicating one of the provided default formats (see FORMATS) or a format String (see Kernel#sprintf) that will receive the following arguments (in order):
- max_item_length (Integer)
- item (String)
- “frequency_bar” (String)
- “padding” (String)
- max_freq_length (Integer)
- freq (Integer)
- percentage (Float, optional)
See HistogramItem for further details on the individual arguments.
# File lib/nuggets/array/histogram_mixin.rb
def formatted_histogram(format = :default, indicator = '=')
  format = FORMATS[format] if FORMATS.key?(format)
  raise ::TypeError, "String expected, got #{format.class}" unless format.is_a?(::String)

  include_percentage = format.include?('%%')
  indicator_length   = indicator.length

  lines = []

  annotated_histogram { |hist|
    arguments = [
      hist.max_item_length, hist.item,                    # item (padded)
      indicator * hist.freq,                              # indicator bar
      (hist.max_freq - hist.freq) * indicator_length, '', # indicator padding
      hist.max_freq_length, hist.freq                     # frequency (padded)
    ]

    arguments << hist.percentage if include_percentage    # percentage (optional)

    lines << format % arguments
  }

  lines.join("\n")
end
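To illustrate how a custom format String consumes these arguments, here is a small sketch that builds the argument list for a single item by hand, in the same order as the listing above. The format string `custom` and the sample values are made up for illustration; they are not one of the library's FORMATS. Padded fields use the `*` width of Kernel#sprintf, so each of them takes two arguments (width, then value).

```ruby
# Hypothetical custom format (assumption, not taken from FORMATS):
# left-aligned item, bar plus padding in brackets, frequency, percentage.
custom = '%-*s [%s%*s] %*d (%.2f%%)'

indicator = '='

# Sample values an item might carry: "ab" occurs 2 times out of 8, the most
# frequent item occurs 3 times, and the longest item is 3 characters wide.
max_item_length, item           = 3, 'ab'
freq, max_freq, max_freq_length = 2, 3, 1
percentage                      = 25.0

arguments = [
  max_item_length, item,                    # width and value for %-*s
  indicator * freq,                         # the frequency bar
  (max_freq - freq) * indicator.length, '', # width and empty value for %*s
  max_freq_length, freq                     # width and value for %*d
]

# The percentage is only passed when the format contains a literal '%%'.
arguments << percentage if custom.include?('%%')

line = custom % arguments
puts line # prints "ab  [== ] 2 (25.00%)"
```

The padding field prints an empty String at a computed width, which keeps the closing bracket aligned across rows whose bars differ in length.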
histogram()
Calculates the frequency histogram of the values in array. Returns a Hash that maps each value, or the result of yielding the value to the block, to its frequency.
# File lib/nuggets/array/histogram_mixin.rb
def histogram
  hist = ::Hash.new(0)
  each { |x| hist[block_given? ? yield(x) : x] += 1 }
  hist
end
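A short usage sketch of both call styles follows. To keep it runnable without the gem, `HistogramSketch` is a hypothetical stand-in module containing a copy of the method body shown above, mixed into a single array with `extend`.

```ruby
# Hypothetical stand-in for the mixin; the method body is copied from the
# listing above so that the example runs without loading the nuggets gem.
module HistogramSketch
  def histogram
    hist = ::Hash.new(0)
    each { |x| hist[block_given? ? yield(x) : x] += 1 }
    hist
  end
end

words = %w[apple avocado banana apple cherry].extend(HistogramSketch)

# Without a block, the values themselves are counted.
plain = words.histogram
# plain == { "apple" => 2, "avocado" => 1, "banana" => 1, "cherry" => 1 }

# With a block, the block's result is counted instead (here: first letter).
by_initial = words.histogram { |w| w[0] }
# by_initial == { "a" => 3, "b" => 1, "c" => 1 }
```

Because the Hash is created with a default of 0, looking up a key that never occurred returns 0 rather than nil.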
probability_mass_function(&block)
Calculates the probability mass function (normalized histogram) of the values in array. Returns a Hash that maps each value, or the result of yielding the value to the block, to its probability (via histogram).
# File lib/nuggets/array/histogram_mixin.rb
def probability_mass_function(&block)
  hist, n = histogram(&block), size.to_f
  hist.each { |k, v| hist[k] = v / n }
end
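As a rough illustration of the normalization, the sketch below divides each frequency by the array size, so the probabilities sum to 1. `PmfSketch` is a hypothetical stand-in module with the method bodies copied from the listings on this page, used here so the example runs without the gem.

```ruby
# Hypothetical stand-in for the mixin (assumption: not the gem itself).
module PmfSketch
  def histogram
    hist = ::Hash.new(0)
    each { |x| hist[block_given? ? yield(x) : x] += 1 }
    hist
  end

  def probability_mass_function(&block)
    hist, n = histogram(&block), size.to_f
    # Overwrites each count with its share of the total; returns the Hash.
    hist.each { |k, v| hist[k] = v / n }
  end
end

rolls = [1, 2, 2, 3, 3, 3, 4, 4].extend(PmfSketch)

pmf = rolls.probability_mass_function
# pmf == { 1 => 0.125, 2 => 0.25, 3 => 0.375, 4 => 0.25 }

# The block groups values before normalizing, as with histogram.
parity = rolls.probability_mass_function { |r| r.even? ? :even : :odd }
# parity == { odd: 0.5, even: 0.5 }
```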