Logistic log loss is convex. Because the logarithm is monotonically increasing, taking the log preserves the argmax, so maximizing the likelihood function is equivalent to maximizing the log-likelihood. Since probabilities are at most 1, the log-likelihood is negative, which is why in practice we minimize the negative log-likelihood (the log loss) instead.
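As a quick sanity check on the convexity claim, here is a minimal NumPy sketch (the helper names such as `log_loss` are my own, not from the text). It evaluates the per-example logistic log loss as a function of the logit $z$ and confirms by finite differences that its second derivative is $\sigma(z)(1-\sigma(z)) \ge 0$; since $z = \theta^T x$ is affine in $\theta$, convexity in $z$ carries over to convexity in $\theta$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(z, y):
    """Per-example logistic log loss as a function of the logit z."""
    p = sigmoid(z)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Finite-difference second derivative; convexity means it is >= 0 everywhere.
z = np.linspace(-6, 6, 241)
h = 1e-4
second = (log_loss(z + h, 1) - 2 * log_loss(z, 1) + log_loss(z - h, 1)) / h**2

print(np.all(second >= 0))                                            # True
print(np.allclose(second, sigmoid(z) * (1 - sigmoid(z)), atol=1e-4))  # True: d2/dz2 = sigma(z)(1 - sigma(z))
```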
$l(\theta)$ is the log-likelihood and $-l(\theta)$ is the negative log-likelihood.
The likelihood function measures how well the parameters fit the observed data. It is written $L(\theta)$, where $\theta$ denotes the parameters of the model and $X$ and $\vec{y}$ denote the data, and it is defined as the conditional probability of the observed data given the parameters:

$$L(\theta) = L(\theta; X, \vec{y}) = p(\vec{y} \mid X; \theta)$$

Taking the log turns the product over independent examples into a sum:

$$l(\theta) = \log L(\theta) = \sum_{i=1}^n \log p(y^{(i)} \mid x^{(i)}; \theta)$$
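To make the definitions concrete, here is a small NumPy sketch under assumed toy data and a logistic model $p(y=1 \mid x; \theta) = \sigma(\theta^T x)$. It computes the likelihood as the product of per-example probabilities and the log-likelihood as the sum of their logs, and checks that $\log L(\theta) = l(\theta)$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: 4 examples, 2 features.
X = np.array([[0.5, 1.2], [1.5, -0.3], [-0.7, 0.8], [2.0, 1.0]])
y = np.array([1, 0, 1, 1])
theta = np.array([0.4, -0.2])

# Per-example p(y^(i) | x^(i); theta) under the logistic model.
p1 = sigmoid(X @ theta)                     # P(y = 1 | x; theta)
per_example = np.where(y == 1, p1, 1 - p1)

L = np.prod(per_example)                    # likelihood  L(theta) = p(y | X; theta)
ll = np.sum(np.log(per_example))            # log-likelihood  l(theta)

print(L, ll, np.isclose(np.log(L), ll))     # log of the product equals the sum of logs
```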
The cost function $J(\theta)$ usually refers to the negative log-likelihood.
NLL (negative log-likelihood):

$$\text{NLL}(\theta) = - \log \prod_{i=1}^n p(Y_i \mid \theta)$$

Expanding the Bernoulli probability $p(Y_i \mid \theta)$:

$$= - \log \prod_{i=1}^n \theta^{1(Y_i = 1)} (1 - \theta)^{1(Y_i = 0)} = - \sum_{i=1}^n \left[ 1(Y_i = 1) \log \theta + 1(Y_i = 0) \log(1 - \theta) \right]$$

Grouping the terms for $Y_i = 1$ and $Y_i = 0$:

$$= - \left( N_1 \log \theta + N_0 \log(1 - \theta) \right), \quad \text{where } N_j = \sum_{i=1}^n 1(Y_i = j), \quad j = 0, 1.$$
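A short NumPy sketch on a hypothetical Bernoulli sample, verifying that the per-example sum and the grouped $(N_1, N_0)$ form of the NLL agree, and that the minimizer over a grid of $\theta$ values lands near the sample mean $N_1/n$ (the MLE).

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.binomial(1, 0.3, size=200)          # hypothetical Bernoulli sample
N1, N0 = Y.sum(), (1 - Y).sum()

def nll_per_example(theta):
    """NLL(theta) = -sum_i [1(Y_i=1) log theta + 1(Y_i=0) log(1-theta)]."""
    return -np.sum(Y * np.log(theta) + (1 - Y) * np.log(1 - theta))

def nll_grouped(theta):
    """Grouped form: -(N1 log theta + N0 log(1-theta))."""
    return -(N1 * np.log(theta) + N0 * np.log(1 - theta))

thetas = np.linspace(0.01, 0.99, 99)
print(np.allclose([nll_per_example(t) for t in thetas],
                  [nll_grouped(t) for t in thetas]))      # the two forms agree

# The grid minimizer sits near the sample mean N1 / n, which is the MLE.
best = thetas[np.argmin([nll_grouped(t) for t in thetas])]
print(best, N1 / len(Y))
```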