A scalar is a physical quantity that is completely described by its magnitude.
Examples: volume, speed, density, etc.
A vector is an object that has both a magnitude and a direction. Geometrically, we can picture a vector as a directed line segment, whose length is the magnitude of the vector and with an arrow indicating the direction.
Example: an N-tuple, that is, an ordered list of N numbers.
In machine learning, an ordered list of feature values is called a vector (often a feature vector).
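As a minimal sketch of the idea, a feature vector can be stored as a plain Python tuple, and its magnitude computed as the Euclidean length. The feature names and values below are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical feature vector: (height in cm, weight in kg, age in years).
features = (170.0, 65.0, 30.0)

def magnitude(v):
    """Return the Euclidean length of a vector given as a sequence of numbers."""
    return math.sqrt(sum(component ** 2 for component in v))

print(len(features))           # number of components, the "N" in N-tuple
print(magnitude((3.0, 4.0)))   # 5.0
```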
A set in mathematics is an unordered collection of well-defined, distinct objects.
Two sets can be "added" together. The union of A and B, denoted by A ∪ B, is the set of all things which are members of either A or B.
Example: if A = {1, 2, 3} and B = {3, 4, 5}, then A ∪ B = {1, 2, 3, 4, 5}.
The intersection of A and B, denoted by A ∩ B, is the set of all things which are members of both A and B. If A ∩ B = ∅, then A and B are said to be disjoint.
Example: if A = {1, 2, 3} and B = {3, 4, 5}, then A ∩ B = {3}.
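Union, intersection, and disjointness can be tried out directly with Python's built-in `set` type. The sets `A`, `B`, and `C` below are illustrative values, not taken from the text.

```python
A = {1, 2, 3}
B = {3, 4, 5}
C = {6, 7}

union = A | B                  # members of either A or B
intersection = A & B           # members of both A and B
disjoint = A.isdisjoint(C)     # True when the intersection is empty

print(union)          # {1, 2, 3, 4, 5}
print(intersection)   # {3}
print(disjoint)       # True
```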
Example: The sum of a function F(x) from x = 0 to n is written as $\sum_{x=0}^{n} F(x)$.
Example: The product of a function F(x) from x = 0 to n is written as $\prod_{x=0}^{n} F(x)$.
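The sum and product notation maps closely onto Python's `sum` and `math.prod` (available from Python 3.8). This is a small sketch; the function `F` and the value of `n` are hypothetical and chosen only so the result is easy to check by hand.

```python
import math

def F(x):
    # Hypothetical function, used only for illustration.
    return x + 1

n = 4

# Sum of F(x) for x = 0, 1, ..., n: 1 + 2 + 3 + 4 + 5
total = sum(F(x) for x in range(n + 1))

# Product of F(x) for x = 0, 1, ..., n: 1 * 2 * 3 * 4 * 5
product = math.prod(F(x) for x in range(n + 1))

print(total)     # 15
print(product)   # 120
```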
Example: $S' \leftarrow \{x^3 \mid x \in S,\ x > 5\}$
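Set-builder notation corresponds closely to a Python set comprehension. The sketch below reads the example as "the cubes of all elements of S that are greater than 5"; the contents of `S` are an assumption made for illustration.

```python
# Hypothetical input set.
S = {2, 4, 6, 8}

# Cube every element of S that is greater than 5.
S_prime = {x ** 3 for x in S if x > 5}

print(S_prime == {216, 512})   # True
```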
A function f from a set X to a set Y is a mapping such that each element of X is mapped to exactly one element of Y.
For a function y = f(x):
f has a local minimum at a point c if f(x) ≥ f(c) for every x in some open interval around c, and a local maximum at c if f(x) ≤ f(c) for every x in such an interval.
f has a global maximum at c if f(x) ≤ f(c) for every x in its whole domain; similarly, f has a global minimum at c if f(x) ≥ f(c) for every x in its whole domain.
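The difference between local and global extrema can be illustrated numerically. The sketch below samples a hypothetical function, f(x) = x³ − 3x, on a grid over the assumed domain [−3, 3]. A grid search like this only approximates extrema and is not a general optimization method.

```python
def f(x):
    # Hypothetical example function.
    return x ** 3 - 3 * x

# Sample the assumed domain [-3, 3] in steps of 0.01.
xs = [i / 100 for i in range(-300, 301)]
ys = [f(x) for x in xs]

# Interior grid points that are lower (higher) than both neighbours.
local_minima = [xs[i] for i in range(1, len(xs) - 1)
                if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]]
local_maxima = [xs[i] for i in range(1, len(xs) - 1)
                if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]]

# Global extrema over the whole sampled domain.
global_min_x = min(xs, key=f)
global_max_x = max(xs, key=f)

print(local_minima, local_maxima)    # [1.0] [-1.0]
print(global_min_x, global_max_x)    # -3.0 3.0
```

Here the local extrema at x = 1 and x = −1 are not the global ones: on this domain the global minimum and maximum sit at the endpoints.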
The derivative of y with respect to x is the limit of the change in y over the change in x, as the change in x approaches zero. In mathematical terms:
$$\frac{dy}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
Differentiation is a method of finding the derivative of a function.
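A derivative can be approximated numerically by taking a small but finite change in x. The sketch below uses a central difference, which is one common choice and not the only one. The helper name `derivative`, the step size `h`, and the test function are all assumptions made for illustration.

```python
def derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference of step h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    return x * x

# The exact derivative of x^2 is 2x, so the slope at x = 3 should be close to 6.
slope = derivative(square, 3.0)
print(abs(slope - 6.0) < 1e-4)   # True
```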
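For a function $f: \mathbb{R}^n \to \mathbb{R}$, its gradient $\nabla f: \mathbb{R}^n \to \mathbb{R}^n$ is defined at the point $p = (x_1, x_2, \ldots, x_n)$ in n-dimensional space as the vector of partial derivatives:
$$\nabla f(p) = \left( \frac{\partial f}{\partial x_1}(p), \frac{\partial f}{\partial x_2}(p), \ldots, \frac{\partial f}{\partial x_n}(p) \right)$$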
Example: Let $f(x, y) = x^2 y$. Then $\nabla f = (2xy, x^2)$, so $\nabla f(3, 2) = (12, 9)$, or $12\mathbf{i} + 9\mathbf{j}$.
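The gradient in this example can be checked numerically by approximating each partial derivative with a central difference. This is a rough sketch; `numerical_gradient` is a hypothetical helper written for this illustration, not a library function.

```python
def f(x, y):
    # f(x, y) = x^2 * y, the function from the example.
    return x ** 2 * y

def numerical_gradient(func, point, h=1e-6):
    """Approximate the gradient of func at point, one coordinate at a time."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

grad = numerical_gradient(f, (3.0, 2.0))
# Should be close to the analytic result (2xy, x^2) = (12, 9).
print([round(g, 4) for g in grad])
```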
A random variable is described informally as a variable whose values depend on outcomes of a random phenomenon. There are two types of random variables, discrete and continuous.
A discrete random variable is one which may take on only a countable number of distinct values. For example, if a coin is tossed three times and X represents the number of times that the coin comes up heads, then X is a discrete random variable that can only take the values 0, 1, 2, 3. No other value is possible for X.
The probability distribution of a discrete random variable is a list of probabilities associated with each of its possible values. It is also sometimes called the probability function or the probability mass function.
Suppose a random variable X may take n different values, with the probability that $X = x_i$ defined to be $P(X = x_i) = p_i$. The probabilities $p_i$, for i = 1 to n, must satisfy the following:
$$0 \le p_i \le 1 \ \text{for each } i, \qquad \sum_{i=1}^{n} p_i = 1$$
For example, let's consider a probability histogram:
The variable X can take the values 1, 2, 3, 4. Each outcome has a probability:
| Outcome | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Probability | 0.1 | 0.3 | 0.4 | 0.2 |
The probability that X is equal to 2 or 3 is the sum of the two probabilities: P(X=2 or X=3) = P(X=2) + P(X=3) = 0.3 + 0.4 = 0.7.
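The distribution in the table can be written as a dictionary and checked against the two conditions for a probability mass function. This is a small sketch; the variable names are chosen for illustration.

```python
import math

# Probability mass function from the table: outcome -> probability.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# Every probability lies between 0 and 1, and together they sum to 1.
all_valid = all(0.0 <= p <= 1.0 for p in pmf.values())
sums_to_one = math.isclose(sum(pmf.values()), 1.0)

# The outcomes 2 and 3 cannot both occur, so their probabilities add.
p_2_or_3 = pmf[2] + pmf[3]

print(all_valid, sums_to_one)   # True True
print(round(p_2_or_3, 10))      # 0.7
```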
A continuous random variable is one which can take any value in an interval, and so has infinitely (uncountably) many possible values. Continuous random variables are usually measurements. Examples include height, weight, the amount of sugar in an orange, and the time required to run a mile.
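For a continuous random variable, the probability of any single specific value is zero. Instead, probabilities are defined over intervals of values and are represented by the area under a curve. The probability density function (pdf) is the function that describes this curve: its value at a point is a density, not a probability, and the probability that X falls in an interval is the area under the pdf over that interval.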
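For any continuous random variable with probability density function f(x), we have that:
$$f(x) \ge 0 \ \text{for all } x, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$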
For example:
The probability density function (pdf) ρ(x) of this random variable looks like a bell-shaped curve, and ∫ρ(x)dx = 1.
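The claim that the area under a pdf equals 1 can be checked numerically. The sketch below assumes the bell-shaped curve is the standard normal density, which the text does not state explicitly, and uses a simple midpoint rule. The helper names `normal_pdf` and `integrate` are hypothetical.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution (assumed here to be the bell curve)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Nearly all of the area lies within ten standard deviations of the mean.
total_area = integrate(normal_pdf, -10.0, 10.0)

# Probability of an interval = area under the pdf over that interval.
within_one_sigma = integrate(normal_pdf, -1.0, 1.0)

print(round(total_area, 6))         # 1.0
print(round(within_one_sigma, 4))   # 0.6827
```

The density at a single point, such as `normal_pdf(0.0)` (about 0.399), is not itself a probability; only areas under the curve are.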
The expected value of X is a generalization of the weighted average of the possible values that X can take, with each value weighted by the probability of that value occurring.
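For a discrete random variable taking the values $x_1, x_2, \ldots, x_k$ with probabilities $p_1, p_2, \ldots, p_k$:
$$E[X] = \sum_{i=1}^{k} x_i p_i = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$$
For a continuous random variable with probability density function f(x):
$$E[X] = \int x f(x)\,dx$$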
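For a discrete random variable, the expected value is a direct weighted sum. As a quick sketch, the code below uses the distribution with outcomes 1 to 4 and probabilities 0.1, 0.3, 0.4, 0.2 given earlier; the variable names are chosen for illustration.

```python
import math

values = [1, 2, 3, 4]
probabilities = [0.1, 0.3, 0.4, 0.2]

# E[X] = x1*p1 + x2*p2 + ... + xk*pk
expected_value = sum(x * p for x, p in zip(values, probabilities))

# 1*0.1 + 2*0.3 + 3*0.4 + 4*0.2 = 2.7
print(math.isclose(expected_value, 2.7))   # True
```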