Glossary

Bernoulli distribution
a named distribution for a random variable with binary outcomes; $1$ usually denotes the level of interest
categorical variable
a variable in a dataset that takes on values that are labels rather than numbers, so arithmetic on them is not meaningful
dataframe
a two-dimensional data structure in the programming language R in which each row represents an observation and each column represents a variable
discrete random variable
a random variable that only takes on a countable set of values
independent and identically distributed
a description of data that suggests the data were randomly sampled: no two data points intentionally share anything in common (independent), except that they come from the same population (identically distributed)
individual/observation
a single noun in the population of interest, not necessarily a person
interpolate
estimate a value that falls within the range of the observed data
level
one of the values that a categorical variable can take on
maximum likelihood estimator
a best guess of a parameter: the value that maximizes the likelihood of the observed data
observation/individual
a single noun in the population of interest, not necessarily a person
parameter
a characteristic of a population; more abstractly, a non-data argument of a probability density function
percentile
the value in the support of the random variable that puts $p$% of the area under the probability density function to the left of it
population
the broader group of nouns of interest
probability density function
a function indexed by parameter(s) of interest, the shape of which theoretically describes the process of interest
proportion
AKA a mean, when applied to numerically encoded ($0$/$1$) binary categorical data; unfortunately often thought of only as $\text{successes} / \text{trials}$
random variable
a function that assigns a numerical value to each outcome of a random process, e.g. $X(\text{Caniformia}) = 1$
sample
a subset of the population, ideally randomly collected
statistic
any function of data
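
Several of the entries above (Bernoulli distribution, independent and identically distributed, proportion, maximum likelihood estimator, percentile) can be tied together in a short R sketch. The seed, the sample size `n`, and the value `p = 0.3` are arbitrary assumptions chosen for illustration; they do not come from the glossary itself.

```r
# Illustrative sketch only; the seed, n, and p are assumed values.
set.seed(1)
n <- 1000
p <- 0.3

# A sample of n independent and identically distributed Bernoulli(p) draws,
# encoded numerically: 1 is the level of interest, 0 is the other level.
x <- rbinom(n, size = 1, prob = p)

# A proportion is a statistic: the mean of the 0/1 data,
# which is the same number as successes / trials.
p_hat <- mean(x)
same <- isTRUE(all.equal(p_hat, sum(x) / length(x)))

# For Bernoulli data, this sample proportion is also the
# maximum likelihood estimator of the parameter p.

# A percentile of a named distribution: the value that puts 90% of the
# area under the standard normal density to its left.
q90 <- qnorm(0.90)
area_left <- pnorm(q90)  # the area to the left of q90
```

Here `p_hat` estimates the parameter `p` from the sample, and `pnorm(qnorm(0.90))` checks the percentile definition by recovering the area to the left of the 90th percentile.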