The Python NumPy library has many aggregate or statistical functions for working with one-dimensional or multi-dimensional arrays. Some of the useful aggregate functions are mean(), min(), max(), average(), sum(), median(), and percentile(). The uses of the mean(), min(), and max() functions are described in this tutorial. The mean() function returns the arithmetic mean of the array elements, which is calculated by dividing the sum of all elements of the array by the total number of elements. If a particular axis is given, the mean is calculated along that axis. The max() function finds the maximum value among the array elements or along a particular axis of the array. The min() function finds the minimum value among the array elements or along a particular axis.
Use of mean() function
The syntax of the mean() function is given below.
Syntax:
numpy.mean(input_array, axis=None, dtype=None, out=None, keepdims=<no value>)
This function can take five arguments. The purposes of these arguments are described below:
input_array
It is a mandatory argument that takes an array as the value and the average of the array values is calculated by this function.
axis
It is an optional argument, and the value of this argument can be an integer or the tuple of integers. This argument is used for the multi-dimensional array. If the value of the axis is set to 0, then the function will calculate the mean of the column values, and if the value of the axis is set to 1, then the function will calculate the mean of the row values.
dtype
It is an optional argument that is used to define the data type of the mean value.
out
It is an optional argument that is used when the output of the function needs to be stored in an alternative array. In this case, the output array must have the same shape as the expected result (not the shape of the input array). The default value of this argument is None.
keepdims
It is an optional argument, and any Boolean value can be set in this argument. If it is set to True, the reduced axes are kept in the result as dimensions of size one, so the result broadcasts correctly against the input array.
This function returns the mean as a single value when no axis is given, or an array of mean values when an axis is given. If the out argument is set, the function returns a reference to the output array.
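The arguments described above can be sketched on a small array. This is a minimal illustration rather than part of the original tutorial; the array values and the variable names (data, buffer, centered) are chosen only for the example.

```python
import numpy as np

# A small 2x3 array chosen for illustration.
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

overall = np.mean(data)            # mean of all six elements
by_column = np.mean(data, axis=0)  # one mean per column
by_row = np.mean(data, axis=1)     # one mean per row

# keepdims=True keeps the reduced axis as a dimension of size one,
# so the result has shape (2, 1) and broadcasts against the input.
row_means = np.mean(data, axis=1, keepdims=True)
centered = data - row_means

# out= writes the result into an existing array that has the
# shape of the result (here one value per column).
buffer = np.empty(3)
np.mean(data, axis=0, out=buffer)

print(overall)    # 3.5
print(by_column)  # [2.5 3.5 4.5]
print(by_row)     # [2. 5.]
print(row_means.shape)  # (2, 1)
```

Subtracting row_means from data works only because keepdims=True left the result with shape (2, 1); without it the shape would be (2,), which does not broadcast against a (2, 3) array.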
Example: Using mean() function
The following example shows how the mean value of a one-dimensional and two-dimensional array can be calculated. Here, the first mean() function is used with a one-dimensional array of integer numbers, and the second mean() function is used with a two-dimensional array of integer numbers.
# Import the NumPy library
import numpy as np
# Create a one-dimensional array
np_array = np.array([6, 4, 9, 3, 1])
# Print array and mean values
print("The values of the one-dimensional NumPy array are:\n", np_array)
print("The mean value of the one-dimensional array is:\n", np.mean(np_array))
# Create a two-dimensional array
np_array = np.array([[5, 3, 5], [5, 4, 3]])
# Print array and the mean of each column (axis=0)
print("\nThe values of the two-dimensional NumPy array are:\n", np_array)
print("The mean values of the two-dimensional array are:\n", np.mean(np_array, axis=0))
Output:
The following output will appear after executing the above script.
Use of max() function
The syntax of the max() function is given below.
Syntax:
numpy.max(input_array, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
This function can take six arguments. The purposes of these arguments are described below:
input_array
It is a mandatory argument that takes an array as the value, and this function finds out the maximum value of the array.
axis
It is an optional argument, and its value can be an integer or the tuple of integers. This argument is used for the multi-dimensional array.
out
It is an optional argument and is used when the output of the function will need to store in an alternative array.
keepdims
It is an optional argument, and any Boolean value can be set in this argument. If it is set to True, the reduced axes are kept in the result as dimensions of size one, so the result broadcasts correctly against the input array.
initial
It is an optional argument that sets the minimum value of an output element. It must be provided when the maximum is computed over an empty selection of elements.
where
It is an optional argument that takes a Boolean mask selecting which array elements are compared to find the maximum value. When it is used, the initial argument must be supplied as well.
This function returns a single maximum value when no axis is given, or an array of the maximum values along the given axis for a multi-dimensional array.
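The axis, initial, and where arguments can be sketched as follows. The array and the mask are made up for illustration and do not come from the tutorial.

```python
import numpy as np

values = np.array([[21, 5, 34],
                   [12, 30, 6]])

largest = np.max(values)          # maximum of all elements
col_max = np.max(values, axis=0)  # maximum of each column
row_max = np.max(values, axis=1)  # maximum of each row

# where= takes a Boolean mask; initial= has to be given with it
# so that the result is defined even if the mask selects nothing.
below_30 = np.max(values, where=values < 30, initial=0)

# initial= also makes the maximum of an empty array well defined.
empty_max = np.max(np.array([]), initial=-1.0)

print(largest)    # 34
print(col_max)    # [21 30 34]
print(row_max)    # [34 30]
print(below_30)   # 21
print(empty_max)  # -1.0
```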
Example: Using max() function
The following example shows the use of the max() function to find out the maximum value of a one-dimensional array.
# Import the NumPy library
import numpy as np
# Create NumPy array of integers
np_array = np.array([21, 5, 34, 12, 30, 6])
# Find the maximum value from the array
max_value = np.max(np_array)
# Print the maximum value
print('The maximum value of the array is: ', max_value)
Output:
The following output will appear after executing the above script.
Use of min() function
The syntax of the min() function is given below.
Syntax:
numpy.min(input_array, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
The purposes of the arguments of this function are the same as the max() function that has been explained in the part of the max() function. This returns the minimum value of the input array.
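As a short sketch of how the same arguments behave for min(), the example below mirrors the max() case. The array values are illustrative only.

```python
import numpy as np

values = np.array([[21, 5, 34],
                   [12, 30, 6]])

smallest = np.min(values)         # minimum of all elements
col_min = np.min(values, axis=0)  # minimum of each column
row_min = np.min(values, axis=1)  # minimum of each row

# For min(), initial= is the largest value the result may take,
# and where= selects the elements that take part in the comparison.
above_10 = np.min(values, where=values > 10, initial=100)

print(smallest)  # 5
print(col_min)   # [12  5  6]
print(row_min)   # [5 6]
print(above_10)  # 12
```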
Example: Using min() function
The following example shows the use of the min() function to find out the minimum value of a one-dimensional array.
# Import the NumPy library
import numpy as np
# Create NumPy array of integers
np_array = np.array([21, 5, 34, 12, 30, 6])
# Find the minimum value from the array
min_value = np.min(np_array)
# Print the minimum value
print('The minimum value of the array is: ', min_value)
Output:
The following output will appear after executing the above script.
Conclusion
The purposes of three useful aggregate functions (mean(), max(), and min()) have been explained in this tutorial to help readers learn how to use these functions in Python scripts.
numpy
Hypothesis offers a number of strategies for NumPy testing, available in the hypothesis[numpy] extra. It lives in the hypothesis.extra.numpy package.
The centerpiece is the arrays() strategy, which generates arrays with any dtype, shape, and contents you can specify or give a strategy for. To make this as useful as possible, strategies are provided to generate array shapes and generate all kinds of fixed-size or compound dtypes.
hypothesis.extra.numpy.from_dtype(dtype, *, alphabet=None, min_size=0, max_size=None, min_value=None, max_value=None, allow_nan=None, allow_infinity=None, exclude_min=None, exclude_max=None)
Creates a strategy which can generate any value of the given dtype.
Compatible **kwargs are passed to the inferred strategy function for integers, floats, and strings. This allows you to customise the min and max values, control the length or contents of strings, or exclude non-finite numbers. This is particularly useful when kwargs are passed through from arrays(), which allows a variety of numeric dtypes, as it seamlessly handles the width or representable bounds for you. See issue #2552 for more detail.
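A minimal sketch of passing bounds through from_dtype, assuming the hypothesis[numpy] extra is installed. The names bounded_floats, check_bounded, and seen_values are hypothetical, and the settings (few examples, no deadline, no database) are chosen only to keep the sketch quick.

```python
import numpy as np
from hypothesis import given, settings
from hypothesis.extra.numpy import from_dtype

# min_value, max_value and allow_nan are forwarded to the floats
# strategy that from_dtype infers for a float32 dtype.
bounded_floats = from_dtype(
    np.dtype("float32"), min_value=0, max_value=1, allow_nan=False
)

seen_values = []  # hypothetical helper list used only by this sketch


@settings(max_examples=25, deadline=None, database=None)
@given(bounded_floats)
def check_bounded(value):
    # Every generated value respects the requested bounds.
    assert 0.0 <= value <= 1.0
    seen_values.append(float(value))


check_bounded()
```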
hypothesis.extra.numpy.arrays(dtype, shape, *, elements=None, fill=None, unique=False)
Returns a strategy for generating numpy.ndarrays.
dtype may be any valid input to dtype (this includes dtype objects), or a strategy that generates such values.
shape may be an integer >= 0, a tuple of such integers, or a strategy that generates such values.
elements is a strategy for generating values to put in the array. If it is None a suitable value will be inferred based on the dtype, which may give any legal value (including e.g. NaN for floats). If a mapping, it will be passed as **kwargs to from_dtype().
fill is a strategy that may be used to generate a single background value for the array. If None, a suitable default will be inferred based on the other arguments. If set to nothing() then filling behaviour will be disabled entirely and every element will be generated independently.
unique specifies if the elements of the array should all be distinct from one another. Note that in this case multiple NaN values may still be allowed. If fill is also set, the only valid values for it to return are NaN values (anything for which numpy.isnan returns True, so e.g. for complex numbers (nan+1j) is also a valid fill). Note that if unique is set to True the generated values must be hashable.
Arrays of specified dtype and shape are generated for example like this:
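The original listing is not present here, so the following is only a sketch of one way to use the strategy, assuming the hypothesis[numpy] extra is installed. The dtype, shape, element bounds, and the names check_array and seen_shapes are illustrative choices.

```python
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra.numpy import arrays

seen_shapes = []  # hypothetical helper list used only by this sketch


@settings(max_examples=25, deadline=None, database=None)
@given(arrays(np.int8, (2, 3), elements=st.integers(0, 10)))
def check_array(arr):
    # The generated array has the requested dtype and shape, and its
    # contents come from the elements strategy.
    assert arr.dtype == np.int8
    assert arr.shape == (2, 3)
    assert arr.min() >= 0 and arr.max() <= 10
    seen_shapes.append(arr.shape)


check_array()
```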
See What you can generate and how.
Array values are generated in two parts:
Some subset of the coordinates of the array are populated with a value drawn from the elements strategy (or its inferred form).
If any coordinates were not assigned in the previous step, a single value is drawn from the fill strategy and is assigned to all remaining places.
You can set fill to nothing() if you want to disable this behaviour and draw a value for every element.
If fill is set to None then it will attempt to infer the correct behaviour automatically: If unique is True, no filling will occur by default. Otherwise, if it looks safe to reuse the values of elements across multiple coordinates (this will be the case for any inferred strategy, and for most of the builtins, but is not the case for mutable values or strategies built with flatmap, map, composite, etc) then it will use the elements strategy as the fill, else it will default to having no fill.
Having a fill helps Hypothesis craft high quality examples, but its main importance is when the array generated is large: Hypothesis is primarily designed around testing small examples. If you have arrays with hundreds or more elements, having a fill value is essential if you want your tests to run in reasonable time.
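The two fill behaviours can be sketched side by side, assuming the hypothesis[numpy] extra is installed. The array sizes, element ranges, and the names sparse, dense, check_fill, and fill_runs are made up for this illustration.

```python
import numpy as np
from hypothesis import HealthCheck, given, settings, strategies as st
from hypothesis.extra.numpy import arrays

# A larger array with an explicit background value of 0.0.
sparse = arrays(np.float64, 500, elements=st.floats(0, 1), fill=st.just(0.0))

# A small array with filling disabled, so every element is drawn
# independently; unique=True requires all elements to differ.
dense = arrays(
    np.int64, 5, elements=st.integers(0, 100), fill=st.nothing(), unique=True
)

fill_runs = []  # hypothetical helper list used only by this sketch


@settings(
    max_examples=20,
    deadline=None,
    database=None,
    suppress_health_check=[HealthCheck.too_slow],
)
@given(sparse, dense)
def check_fill(big, small):
    assert big.shape == (500,)
    assert ((big >= 0) & (big <= 1)).all()
    assert len(set(small.tolist())) == 5
    fill_runs.append(big.shape)


check_fill()
```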
hypothesis.extra.numpy.array_shapes(*, min_dims=1, max_dims=None, min_side=1, max_side=None)
Return a strategy for array shapes (tuples of int >= 1).
hypothesis.extra.numpy.scalar_dtypes()
Return a strategy that can return any non-flexible scalar dtype.
hypothesis.extra.numpy.unsigned_integer_dtypes(*, endianness='?', sizes=(8, 16, 32, 64))
Return a strategy for unsigned integer dtypes.
endianness may be < for little-endian, > for big-endian, = for native byte order, or ? to allow either byte order. This argument only applies to dtypes of more than one byte.
sizes must be a collection of integer sizes in bits. The default (8, 16, 32, 64) covers the full range of sizes.
hypothesis.extra.numpy.integer_dtypes(*, endianness='?', sizes=(8, 16, 32, 64))
Return a strategy for signed integer dtypes.
endianness and sizes are treated as for unsigned_integer_dtypes().
hypothesis.extra.numpy.floating_dtypes(*, endianness='?', sizes=(16, 32, 64))
Return a strategy for floating-point dtypes.
sizes is the size in bits of floating-point number. Some machines support 96- or 128-bit floats, but these are not generated by default.
Larger floats (96 and 128 bit real parts) are not supported on all platforms and therefore disabled by default. To generate these dtypes, include these values in the sizes argument.
hypothesis.extra.numpy.complex_number_dtypes(*, endianness='?', sizes=(64, 128))
Return a strategy for complex-number dtypes.
sizes is the total size in bits of a complex number, which consists of two floats. Complex halves (a 16-bit real part) are not supported by numpy and will not be generated by this strategy.
hypothesis.extra.numpy.datetime64_dtypes(*, max_period='Y', min_period='ns', endianness='?')
Return a strategy for datetime64 dtypes, with various precisions from year to attosecond.
hypothesis.extra.numpy.timedelta64_dtypes(*, max_period='Y', min_period='ns', endianness='?')
Return a strategy for timedelta64 dtypes, with various precisions from year to attosecond.
hypothesis.extra.numpy.byte_string_dtypes(*, endianness='?', min_len=1, max_len=16)
Return a strategy for generating bytestring dtypes, of various lengths and byteorder.
While Hypothesis' string strategies can generate empty strings, string dtypes with length 0 indicate that size is still to be determined, so the minimum length for string dtypes is 1.
hypothesis.extra.numpy.unicode_string_dtypes(*, endianness='?', min_len=1, max_len=16)
Return a strategy for generating unicode string dtypes, of various lengths and byteorder.
While Hypothesis' string strategies can generate empty strings, string dtypes with length 0 indicate that size is still to be determined, so the minimum length for string dtypes is 1.
hypothesis.extra.numpy.array_dtypes(subtype_strategy=scalar_dtypes(), *, min_size=1, max_size=5, allow_subarrays=False)
Return a strategy for generating array (compound) dtypes, with members drawn from the given subtype strategy.
hypothesis.extra.numpy.nested_dtypes(subtype_strategy=scalar_dtypes(), *, max_leaves=10, max_itemsize=None)
Return the most-general dtype strategy.
Elements drawn from this strategy may be simple (from the subtype_strategy), or several such values drawn from array_dtypes() with allow_subarrays=True. Subdtypes in an array dtype may be nested to any depth, subject to the max_leaves argument.
hypothesis.extra.numpy.valid_tuple_axes(ndim, *, min_size=0, max_size=None)
Return a strategy for generating permissible tuple-values for the axis argument for a numpy sequential function (e.g. numpy.sum()), given an array of the specified dimensionality.
All tuples will have a length >= min_size and <= max_size. The default value for max_size is ndim.
Examples from this strategy shrink towards an empty tuple, which renders most sequential functions as no-ops.
The following are some examples drawn from this strategy.
valid_tuple_axes can be joined with other strategies to generate any type of valid axis object, i.e. integers, tuples, and None:
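One way to build such a combined strategy is sketched below, assuming the hypothesis[numpy] extra is installed. The names NDIM, any_axis, check_axis, and seen_axes are hypothetical, and the test array of ones is chosen so that the total is easy to verify.

```python
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra.numpy import valid_tuple_axes

NDIM = 3  # dimensionality of the arrays under test (illustrative)

# None, a single (possibly negative) integer axis, or a tuple of axes.
any_axis = st.none() | st.integers(-NDIM, NDIM - 1) | valid_tuple_axes(NDIM)

seen_axes = []  # hypothetical helper list used only by this sketch


@settings(max_examples=30, deadline=None, database=None)
@given(any_axis)
def check_axis(axis):
    x = np.ones((2, 3, 4))
    partial = np.sum(x, axis=axis)
    # Whatever axes were reduced, the grand total of 24 ones is kept.
    assert np.sum(partial) == 24
    seen_axes.append(axis)


check_axis()
```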
hypothesis.extra.numpy.broadcastable_shapes(shape, *, min_dims=0, max_dims=None, min_side=1, max_side=None)
Return a strategy for generating shapes that are broadcast-compatible with the provided shape.
Examples from this strategy shrink towards a shape with length min_dims. The size of an aligned dimension shrinks towards size 1. The size of an unaligned dimension shrinks towards min_side.
shape: a tuple of integers.
min_dims: The smallest length that the generated shape can possess.
max_dims: The largest length that the generated shape can possess. The default value for max_dims is min(32, max(len(shape), min_dims) + 2).
min_side: The smallest size that an unaligned dimension can possess.
max_side: The largest size that an unaligned dimension can possess. The default value is 2 + 'size-of-largest-aligned-dimension'.
The following are some examples drawn from this strategy.
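The drawn examples are not reproduced here. As a hedged sketch of what the strategy guarantees, the check below broadcasts each generated shape against a base shape with NumPy; it assumes the hypothesis[numpy] extra is installed, and BASE, check_broadcastable, and drawn_shapes are names invented for the illustration.

```python
import numpy as np
from hypothesis import given, settings
from hypothesis.extra.numpy import broadcastable_shapes

BASE = (3, 1, 5)  # illustrative base shape

drawn_shapes = []  # hypothetical helper list used only by this sketch


@settings(max_examples=30, deadline=None, database=None)
@given(broadcastable_shapes(BASE, min_dims=1, max_dims=4))
def check_broadcastable(shape):
    # np.broadcast raises ValueError for incompatible shapes, so
    # reaching the assertion shows the shapes are compatible.
    result = np.broadcast(np.empty(BASE), np.empty(shape)).shape
    assert len(result) == max(len(BASE), len(shape))
    drawn_shapes.append(shape)


check_broadcastable()
```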
hypothesis.extra.numpy.mutually_broadcastable_shapes(*, num_shapes=not_set, signature=not_set, base_shape=(), min_dims=0, max_dims=None, min_side=1, max_side=None)
Return a strategy for generating a specified number of shapes, N, that are mutually-broadcastable with one another and with the provided "base-shape".
The strategy will generate a named-tuple of:
input_shapes: the N generated shapes
result_shape: the resulting shape, produced by broadcasting the N shapes with the base-shape
Each shape produced from this strategy shrinks towards a shape with length min_dims. The size of an aligned dimension shrinks towards a size of 1. The size of an unaligned dimension shrinks towards min_side.
num_shapes: The number of mutually broadcast-compatible shapes to generate.
base_shape: The shape against which all generated shapes can broadcast. The default shape is empty, which corresponds to a scalar and thus does not constrain broadcasting at all.
min_dims: The smallest length that any generated shape can possess.
max_dims: The largest length that any generated shape can possess. It cannot exceed 32, which is the greatest supported dimensionality for a numpy array. The default value for max_dims is 2 + max(len(shape), min_dims), capped at 32.
min_side: The smallest size that an unaligned dimension can possess.
max_side: The largest size that an unaligned dimension can possess. The default value is 2 + 'size-of-largest-aligned-dimension'.
The following are some examples drawn from this strategy.
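The drawn examples are not reproduced here. The sketch below, which assumes the hypothesis[numpy] extra is installed, checks the named-tuple fields described above against NumPy's own broadcasting; check_mutual and result_shapes are hypothetical names.

```python
import numpy as np
from hypothesis import given, settings
from hypothesis.extra.numpy import mutually_broadcastable_shapes

result_shapes = []  # hypothetical helper list used only by this sketch


@settings(max_examples=30, deadline=None, database=None)
@given(mutually_broadcastable_shapes(num_shapes=3, max_dims=3))
def check_mutual(shapes):
    inputs = shapes.input_shapes
    assert len(inputs) == 3
    # With the default empty base shape, result_shape is simply the
    # broadcast of the three generated shapes.
    broadcast = np.broadcast(*[np.empty(s) for s in inputs])
    assert broadcast.shape == shapes.result_shape
    result_shapes.append(shapes.result_shape)


check_mutual()
```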
Use with Generalised Universal Function signatures
A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features. A generalised ufunc operates on sub-arrays rather than elements, based on the "signature" of the function. Compare e.g. numpy.add (ufunc) to numpy.matmul (gufunc).
To generate shapes for a gufunc, you can pass the signature argument instead of num_shapes. This must be a gufunc signature string, which you can write by hand or access as e.g. np.matmul.signature on generalised ufuncs.
In this case, the side arguments are applied to the 'core dimensions' as well, ignoring any frozen dimensions. base_shape and the dims arguments are applied to the 'loop dimensions', and if necessary, the dimensionality of each shape is silently capped to respect the 32-dimension limit.
The generated result_shape is the real result shape of applying the gufunc to arrays of the generated input_shapes, even where this is different to broadcasting the loop dimensions.
gufunc-compatible shapes shrink their loop dimensions as above, towards omitting optional core dimensions, and smaller-size core dimensions.
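The difference between a ufunc and a gufunc can be seen with NumPy alone. This small sketch uses only NumPy; the array shapes are illustrative.

```python
import numpy as np

# An element-wise ufunc has no signature; a generalised ufunc
# exposes one as a string with inputs and outputs around "->".
add_signature = np.add.signature
matmul_signature = np.matmul.signature
print(add_signature)     # None
print(matmul_signature)  # the gufunc signature string of matmul

# matmul treats the last two axes as core dimensions and the
# leading axis of `a` as a loop dimension.
a = np.ones((4, 2, 3))
b = np.ones((3, 5))
product_shape = np.matmul(a, b).shape
print(product_shape)     # (4, 2, 5)
```

Here the result shape (4, 2, 5) is not what plain broadcasting of (4, 2, 3) and (3, 5) would give, which is the situation the result_shape description refers to.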
hypothesis.extra.numpy.basic_indices(shape, *, min_dims=0, max_dims=None, allow_newaxis=False, allow_ellipsis=True)
The basic_indices strategy generates basic indexes for arrays of the specified shape, which may include dimensions of size zero.
It generates tuples containing some mix of integers, slice objects, ... (Ellipsis), and numpy.newaxis, which when used to index a shape-shaped array will produce either a scalar or a shared-memory view. When a length-one tuple would be generated, this strategy may instead return the element which will index the first axis, e.g. 5 instead of (5,).
shape: the array shape that will be indexed, as a tuple of integers >= 0. This must be at least two-dimensional for a tuple to be a valid basic index; for one-dimensional arrays use slices() instead.
min_dims: the minimum dimensionality of the resulting view from use of the generated index. When min_dims == 0, scalars and zero-dimensional arrays are both allowed.
max_dims: the maximum dimensionality of the resulting view. If not specified, it defaults to max(len(shape), min_dims) + 2.
allow_ellipsis: whether ... is allowed in the index.
allow_newaxis: whether numpy.newaxis is allowed in the index.
Note that the length of the generated tuple may be anywhere between zero and min_dims. It may not match the length of shape, or even the dimensionality of the array view resulting from its use!
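The kinds of index described above can be tried by hand with NumPy alone. The indexes below are hand-written illustrations of basic indexing, not values drawn from the strategy.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# Hand-written basic indexes: integers, slices, Ellipsis and newaxis.
indices = [
    1,                             # a bare integer
    (1,),                          # the equivalent length-one tuple
    (slice(None), 2),              # every row, third column
    (..., 0),                      # Ellipsis, then first column
    (np.newaxis, 0, slice(1, 3)),  # a new axis in front
    (2, 3),                        # a single element
]

result_shapes = [np.shape(x[index]) for index in indices]
print(result_shapes)  # [(4,), (4,), (3,), (3,), (1, 2), ()]

# Basic indexing returns a view onto the same memory, not a copy.
row_is_view = np.shares_memory(x, x[1])
print(row_is_view)    # True
```

Note how the length of each index tuple differs from both the length of the array shape and the dimensionality of the result.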
hypothesis.extra.numpy.integer_array_indices(shape, *, result_shape=array_shapes(), dtype='int')
Return a search strategy for tuples of integer-arrays that, when used to index into an array of shape shape, give an array whose shape was drawn from result_shape.
Examples from this strategy shrink towards the tuple of index-arrays:
shape: a tuple of integers that indicates the shape of the array whose indices are being generated.
result_shape: a strategy for generating tuples of integers, which describe the shape of the resulting index arrays. The default is array_shapes(). The shape drawn from this strategy determines the shape of the array that will be produced when the corresponding example from integer_array_indices is used as an index.
dtype: the integer data type of the generated index-arrays. Negative integer indices can be generated if a signed integer type is specified.
Recall that an array can be indexed using a tuple of integer-arrays toaccess its members in an arbitrary order, producing an array with anarbitrary shape. For example:
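The original listing is not present here. A minimal NumPy-only sketch of indexing with a tuple of integer-arrays could look like this; the index values are illustrative.

```python
import numpy as np

x = np.arange(5)

# A tuple containing one 2-D integer array, used as the index.
ind = (np.array([[4, 0], [0, 1]]),)

# The result takes the shape of the index array, and its entries
# are the elements of x found at those positions.
picked = x[ind]
print(picked)
# [[4 0]
#  [0 1]]
```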
Note that this strategy does not accommodate all variations of so-called 'advanced indexing', as prescribed by NumPy's nomenclature. Combinations of basic and advanced indexes are too complex to usefully define in a standard strategy; we leave application-specific strategies to the user. Advanced-boolean indexing can be defined as arrays(shape=..., dtype=bool), and is similarly left to the user.
pandas
Hypothesis provides strategies for several of the core pandas data types: pandas.Index, pandas.Series and pandas.DataFrame.
The general approach taken by the pandas module is that there are multiple strategies for generating indexes, and all of the other strategies take the number of entries they contain from their index strategy (with sensible defaults). So e.g. a Series is specified by specifying its numpy.dtype (and/or a strategy for generating elements for it).
hypothesis.extra.pandas.indexes(*, elements=None, dtype=None, min_size=0, max_size=None, unique=True)
Provides a strategy for producing a pandas.Index.
Arguments:
elements is a strategy which will be used to generate the individual values of the index. If None, it will be inferred from the dtype. Note: even if the elements strategy produces tuples, the generated value will not be a MultiIndex, but instead be a normal index whose elements are tuples.
dtype is the dtype of the resulting index. If None, it will be inferred from the elements strategy. At least one of dtype or elements must be provided.
min_size is the minimum number of elements in the index.
max_size is the maximum number of elements in the index. If None then it will default to a suitable small size. If you want larger indexes you should pass a max_size explicitly.
unique specifies whether all of the elements in the resulting index should be distinct.
hypothesis.extra.pandas.range_indexes(min_size=0, max_size=None)
Provides a strategy which generates an Index whose values are 0, 1, ..., n for some n.
Arguments:
min_size is the smallest number of elements the index can have.
max_size is the largest number of elements the index can have. If None it will default to some suitable value based on min_size.
hypothesis.extra.pandas.series(*, elements=None, dtype=None, index=None, fill=None, unique=False)
Provides a strategy for producing a pandas.Series.
Arguments:
elements: a strategy that will be used to generate the individual values in the series. If None, we will attempt to infer a suitable default from the dtype.
dtype: the dtype of the resulting series and may be any value that can be passed to numpy.dtype. If None, will use pandas's standard behaviour to infer it from the type of the elements values. Note that if the type of values that comes out of your elements strategy varies, then so will the resulting dtype of the series.
index: If not None, a strategy for generating indexes for the resulting Series. This can generate either pandas.Index objects or any sequence of values (which will be passed to the Index constructor). You will probably find it most convenient to use the indexes() or range_indexes() function to produce values for this argument.
Usage:
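The original usage listing is not present here. The following is a sketch under the assumption that pandas and the hypothesis[pandas] extra are installed; check_series and series_lengths are hypothetical names, and the settings only keep the run short.

```python
import pandas as pd
from hypothesis import HealthCheck, given, settings
from hypothesis.extra.pandas import range_indexes, series

series_lengths = []  # hypothetical helper list used only by this sketch


@settings(
    max_examples=20,
    deadline=None,
    database=None,
    suppress_health_check=[HealthCheck.too_slow],
)
@given(series(dtype=int, index=range_indexes(max_size=5)))
def check_series(s):
    # The strategy yields a Series with an integer dtype whose length
    # is taken from the index strategy.
    assert isinstance(s, pd.Series)
    assert s.dtype.kind == "i"
    assert len(s) <= 5
    series_lengths.append(len(s))


check_series()
```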
hypothesis.extra.pandas.column(name=None, elements=None, dtype=None, fill=None, unique=False)
Data object for describing a column in a DataFrame.
Arguments:
name: the column name, or None to default to the column position. Must be hashable, but can otherwise be any value supported as a pandas column name.
elements: the strategy for generating values in this column, or None to infer it from the dtype.
dtype: the dtype of the column, or None to infer it from the element strategy. At least one of dtype or elements must be provided.
fill: A default value for elements of the column. See arrays() for a full explanation.
unique: If all values in this column should be distinct.
hypothesis.extra.pandas.columns(names_or_number, *, dtype=None, elements=None, fill=None, unique=False)
A convenience function for producing a list of column objects of the same general shape.
The names_or_number argument is either a sequence of values, the elements of which will be used as the name for individual column objects, or a number, in which case that many unnamed columns will be created. All other arguments are passed through verbatim to create the columns.
hypothesis.extra.pandas.data_frames(columns=None, *, rows=None, index=None)
Provides a strategy for producing a pandas.DataFrame.
Arguments:
columns: An iterable of column objects describing the shape of the generated DataFrame.
rows: A strategy for generating a row object. Should generate either dicts mapping column names to values or a sequence mapping column position to the value in that position (note that unlike the pandas.DataFrame constructor, single values are not allowed here. Passing e.g. an integer is an error, even if there is only one column).
At least one of rows and columns must be provided. If both are provided then the generated rows will be validated against the columns and an error will be raised if they don't match.
Caveats on using rows:
In general you should prefer using columns to rows, and only use rows if the columns interface is insufficiently flexible to describe what you need - you will get better performance and example quality that way.
If you provide rows and not columns, then the shape and dtype of the resulting DataFrame may vary. e.g. if you have a mix of int and float in the values for one column in your row entries, the column will sometimes have an integral dtype and sometimes a float.
index: If not None, a strategy for generating indexes for the resulting DataFrame. This can generate either pandas.Index objects or any sequence of values (which will be passed to the Index constructor). You will probably find it most convenient to use the indexes() or range_indexes() function to produce values for this argument.
Usage:
The expected usage pattern is that you use column() and columns() to specify a fixed shape of the DataFrame you want as follows. For example the following gives a two column data frame:
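The original listing is not present here. A sketch of a fixed two-column frame, assuming pandas and the hypothesis[pandas] extra are installed, could look like this; the column names "A" and "B" and the names check_frame and frame_columns are illustrative.

```python
import pandas as pd
from hypothesis import HealthCheck, given, settings
from hypothesis.extra.pandas import column, data_frames

frame_columns = []  # hypothetical helper list used only by this sketch


@settings(
    max_examples=20,
    deadline=None,
    database=None,
    suppress_health_check=[HealthCheck.too_slow],
)
@given(data_frames([column("A", dtype=int), column("B", dtype=float)]))
def check_frame(df):
    # The column names and dtypes come from the column descriptions.
    assert isinstance(df, pd.DataFrame)
    assert list(df.columns) == ["A", "B"]
    assert df["A"].dtype.kind == "i"
    assert df["B"].dtype.kind == "f"
    frame_columns.append(list(df.columns))


check_frame()
```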
If you want the values in different columns to interact in some way you can use the rows argument. For example the following gives a two column DataFrame where the value in the first column is always at most the value in the second:
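The original listing is not present here. One possible sketch, assuming pandas and the hypothesis[pandas] extra are installed, sorts each generated pair so the first value never exceeds the second. The index with min_size=1 is an extra assumption made so that every frame has at least one row; check_sorted_rows and row_counts are hypothetical names.

```python
from hypothesis import HealthCheck, given, settings, strategies as st
from hypothesis.extra.pandas import data_frames, range_indexes

# Each row is a pair of non-NaN floats, sorted into ascending order.
sorted_pairs = st.tuples(
    st.floats(allow_nan=False), st.floats(allow_nan=False)
).map(sorted)

row_counts = []  # hypothetical helper list used only by this sketch


@settings(
    max_examples=20,
    deadline=None,
    database=None,
    suppress_health_check=[HealthCheck.too_slow],
)
@given(data_frames(rows=sorted_pairs, index=range_indexes(min_size=1)))
def check_sorted_rows(df):
    # Without column descriptions the columns are named by position.
    assert (df[0] <= df[1]).all()
    row_counts.append(len(df))


check_sorted_rows()
```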
You can also combine the two:
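The original listing is not present here. A hedged sketch of combining rows with columns, assuming pandas and the hypothesis[pandas] extra are installed, is shown below; the column names "lo" and "hi" and the names check_combined and combined_runs are illustrative.

```python
from hypothesis import HealthCheck, given, settings, strategies as st
from hypothesis.extra.pandas import columns, data_frames

# Rows supply ordered pairs; columns supply the names and the dtype.
ordered_pairs = st.tuples(
    st.floats(allow_nan=False), st.floats(allow_nan=False)
).map(sorted)

combined_runs = []  # hypothetical helper list used only by this sketch


@settings(
    max_examples=20,
    deadline=None,
    database=None,
    suppress_health_check=[HealthCheck.too_slow],
)
@given(
    data_frames(columns=columns(["lo", "hi"], dtype=float), rows=ordered_pairs)
)
def check_combined(df):
    assert list(df.columns) == ["lo", "hi"]
    assert (df["lo"] <= df["hi"]).all()
    combined_runs.append(len(df))


check_combined()
```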
(Note that the column dtype must still be specified and will not be inferred from the rows. This restriction may be lifted in future).
Combining rows and columns has the following behaviour:
The column names and dtypes will be used.
If the column is required to be unique, this will be enforced.
Any values missing from the generated rows will be provided using the column's fill.
Any values in the row not present in the column specification (if dicts are passed, if there are keys with no corresponding column name, if sequences are passed if there are too many items) will result in InvalidArgument being raised.
Supported versions
There is quite a lot of variation between pandas versions. We only commit to supporting the latest version of pandas, but older minor versions are supported on a "best effort" basis. Hypothesis is currently tested against and confirmed working with every Pandas minor version from 0.25 through to 1.1.
Releases that are not the latest patch release of their minor version are not tested or officially supported, but will probably also work unless you hit a pandas bug.