🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
⏱️Coach🧩Problems🧠Thinking🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way
∑MathIntermediate

Convex Sets & Functions

Key Points

  • •
    A set is convex if every line segment between any two of its points lies entirely inside the set.
  • •
    A function is convex if its value on any weighted average of points is no more than the same weighted average of its values at those points.
  • •
    Convexity guarantees nice properties like unique global minima and efficient optimization algorithms.
  • •
    The convex hull is the smallest convex set containing all given points and can be built in O(n log n) time.
  • •
    First- and second-order tests use gradients and Hessians to certify convexity of differentiable functions.
  • •
    Jensen’s inequality is the signature inequality that characterizes convex functions.
  • •
    Checking whether a simple polygon is convex can be done by verifying the sign of cross products around its boundary.
  • •
    Discrete convexity on sequences can be checked with nonnegative second differences and enables O(log n) search for the minimum.

Prerequisites

  • →Vectors and Euclidean geometry — Understanding points, lines, and distances is essential for geometric convexity and cross products.
  • →Dot and cross products — Orientation tests and curvature intuition rely on these operations.
  • →Calculus (derivatives, Hessians) — First- and second-order convexity tests use gradients and second derivatives.
  • →Linear algebra — Positive semidefinite matrices and affine/linear mappings are core to convex analysis.
  • →Algorithm analysis — To understand O(n \log n) hull construction and O(n) convexity checks.
  • →Sorting and comparison-based algorithms — Convex hull algorithms depend on sorting points stably and efficiently.
  • →Numerical robustness — Avoid overflow/precision issues in cross products and difference computations.
  • →Sequences and finite differences — Discrete convexity uses second differences analogous to second derivatives.

Detailed Explanation

Tap terms for definitions

01Overview

Convexity is a geometric and analytic property that simplifies reasoning about shapes and functions. A set in a vector space is convex if, for any two points in the set, the entire line segment connecting them also lies within the set. This seemingly simple requirement has powerful consequences: convex sets have no “dents,” which makes many geometric operations and algorithms stable and predictable. For functions, convexity means the graph of the function lies below any chord connecting two points on the graph. This property rules out local minima that are not global, enabling reliable optimization. In computer science and computational geometry, convexity underlies algorithms such as convex hull construction, collision detection, and linear programming. In optimization and machine learning, convex objective functions lead to algorithms that converge to the best solution without getting trapped in local minima. Convexity is also robust under common operations such as intersection of convex sets and nonnegative linear combinations of convex functions. Together, these features make convexity a central concept linking geometry, analysis, and algorithms.

02Intuition & Analogies

Imagine stretching a rubber band around a set of nails on a board. When you let go, the band snaps tightly to form a shape that has no inward dents—the convex hull. Any straight line you draw between two points inside this shape stays entirely inside; that’s the heart of convexity. If you think of a bowl-shaped surface, like a salad bowl, a marble placed anywhere on it will roll to the bottom—there’s only one lowest point. This is the intuition behind convex functions: they’re bowl-shaped (possibly with flat bottoms), so any local minimum is also the global minimum. Now picture taking an average of two locations: if you’re inside a convex park and you walk halfway from one bench to another, you’re still inside the park. Likewise, for convex functions, the value at an average point is no more than the average of the values—graphically, the curve lies below the chord between two points. For discrete data (like prices over days), convexity means the data curve bends upwards but never forms a “V” pointing down; the second differences are nonnegative, analogous to nonnegative second derivatives in calculus. In algorithms, convexity acts like a promise: there are no traps or hidden pockets. This promise lets us prune search spaces confidently (as in ternary search for unimodal sequences) and combine pieces safely (as in intersections of convex regions). When sets are convex, simple orientation checks using cross products suffice to verify convexity; when functions are convex, gradient and curvature (Hessian) behave predictably, letting us reason with linear approximations and quadratic lower bounds.

03Formal Definition

A set C ⊆ Rd is convex if for all x, y ∈ C and all λ ∈ [0,1], the convex combination λ x + (1-λ) y ∈ C. Equivalently, C contains the entire line segment between any two of its points. The convex hull of a set S is the smallest convex set containing S and can be expressed as all finite convex combinations of points in S. A function f: Rd → R is convex if for all x, y in its domain and all λ ∈ [0,1], f(λ x + (1-λ) y) ≤ λ f(x) + (1-λ) f(y). If the inequality is strict for distinct x, y and λ ∈ (0,1), f is strictly convex. If f is differentiable, convexity is equivalent to the first-order condition f(y) ≥ f(x) + ∇ f(x)^⊤ (y - x) for all x, y, meaning the function lies above each of its tangents. If f is twice differentiable, convexity is implied by positive semidefiniteness of the Hessian: ∇2 f(x) ⪰ 0 for all x in the domain. Strong convexity with parameter m>0 strengthens the first-order condition to include a quadratic term, guaranteeing curvature bounded away from zero and uniqueness and stability of minimizers. Convex sets admit supporting hyperplanes at boundary points, and disjoint convex sets can often be separated by a hyperplane under mild conditions.

04When to Use

Use convex sets when modeling feasible regions where you want interpolation to remain feasible—for example, blending two valid solutions should remain valid. This is common in linear programming, resource allocation, and safe regions in motion planning. Use convex hulls to summarize and bound point clouds, detect collisions via bounding volumes, compute shape descriptors, and preprocess data for more complex geometric queries. Use convex functions when you need optimization with guarantees: if the objective is convex and the constraints form a convex feasible set, any local optimum is global, and gradient-based methods converge reliably. In data fitting and machine learning, least squares and logistic regression are convex, enabling scalable solvers. In signal processing, total variation and L1-regularized problems leverage convexity for sparsity and robustness. In discrete settings, use discrete convexity (nonnegative second differences) to identify unimodal sequences and perform O(\log n) searches for extrema. In computational geometry, test polygon convexity to choose faster algorithms; many algorithms run faster or are simpler for convex polygons than for arbitrary ones. Use supporting hyperplanes for approximation, collision checks, and to construct separating boundaries. Projection onto convex sets is central in alternating projection methods, proximal algorithms, and constraint handling in iterative solvers.

⚠️Common Mistakes

A frequent mistake is assuming a function or set is convex based on limited visual inspection; always verify via definitions: for sets, check line segments; for functions, apply Jensen’s inequality, gradient tests, or Hessian PSD checks. Another error is conflating unimodality with convexity: every convex function on a line is unimodal, but not every unimodal function is convex; algorithms like ternary search work for unimodal functions but convexity grants stronger inequalities and stability. In computational geometry, people sometimes test polygon convexity with cross products but forget to handle collinearity or consistent vertex ordering, leading to false negatives or positives; decide whether collinear edges are allowed and treat zero cross products carefully. Numerical issues can bite: computing cross products or second differences with large coordinates can overflow 32-bit integers; prefer 64-bit or wider types and be cautious with floating-point comparisons by tolerating small epsilons. Another pitfall is assuming intersections of convex sets are always strictly convex; intersections remain convex but can introduce flat faces, affecting algorithms that rely on strict curvature. In optimization, overlooking domain convexity is common: a convex formula restricted to a nonconvex domain loses its guarantees. Finally, mixing up strong convexity with mere convexity can lead to incorrect convergence claims; rates that use parameters m and L (curvature and smoothness) require verifying those constants.

Key Formulas

Convex set definition

C is convex ⟺∀x,y∈C, ∀λ∈[0,1], λx+(1−λ)y∈C

Explanation: A set is convex if it contains the line segment between any two of its points. This is the foundational property used in geometry and optimization.

Convex function definition (Jensen form)

f(λx+(1−λ)y)≤λf(x)+(1−λ)f(y), ∀x,y, λ∈[0,1]

Explanation: A function is convex if it lies below the straight line between any two points on its graph. This inequality generalizes to finite convex combinations.

Convex hull

conv(S)={i=1∑k​λi​xi​:xi​∈S, λi​≥0, i=1∑k​λi​=1, k∈N}

Explanation: The convex hull consists of all convex combinations of finitely many points from S. It is the smallest convex set containing S.

First-order condition for convexity

f(y)≥f(x)+∇f(x)⊤(y−x)

Explanation: For differentiable convex functions, each tangent plane underestimates the function. This linear lower bound characterizes convexity.

Second-order (Hessian) test

∇2f(x)⪰0 ⇒ f is convex

Explanation: If the Hessian is positive semidefinite everywhere on a convex domain, the function is convex. This captures the idea of nonnegative curvature.

Jensen’s inequality (finite form)

f(i=1∑k​λi​xi​)≤i=1∑k​λi​f(xi​),λi​≥0, i=1∑k​λi​=1

Explanation: Extends the two-point convexity inequality to many points. It is used to bound expectations and averages in probability and analysis.

Subgradient inequality

f(y)≥f(x)+g⊤(y−x),g∈∂f(x)

Explanation: For nondifferentiable convex functions, any subgradient defines a supporting hyperplane lying below the function. This generalizes the gradient condition.

Strong convexity

f(y)≥f(x)+∇f(x)⊤(y−x)+2m​∥y−x∥22​

Explanation: Strongly convex functions have curvature bounded away from zero by m. This yields uniqueness of minimizers and faster convergence of algorithms.

L-smoothness (Lipschitz gradient)

∥∇f(x)−∇f(y)∥2​≤L∥x−y∥2​

Explanation: Bounds how quickly the gradient can change. Combined with strong convexity, it controls convergence rates of gradient methods.

Projection onto a convex set

PC​(x)=argy∈Cmin​∥x−y∥2​

Explanation: The projection is the closest point in C to x. In convex sets this point exists and is unique, enabling projection-based algorithms.

Minkowski sum

A⊕B={a+b:a∈A,b∈B}

Explanation: The sum of two convex sets is convex. This operation is used in motion planning and morphology.

Carathéodory’s theorem (informal form)

x∈conv(S)⇒x=i=1∑d+1​λi​vi​, vi​∈S, λi​≥0, ∑λi​=1

Explanation: In Rd, any point in a convex hull can be represented using at most d+1 points. This limits combination complexity.

Convex hull complexity (Andrew’s algorithm)

T(n)=O(nlogn), S(n)=O(n)

Explanation: The monotone chain algorithm sorts points and builds the hull in linear passes, leading to n log n time and linear extra space.

Discrete convexity (second differences)

Δ2f[i]=f[i−1]−2f[i]+f[i+1]≥0

Explanation: A sequence is discretely convex if all second differences are nonnegative. This implies a single basin (unimodality) enabling efficient searches.

Complexity Analysis

The convex hull example uses Andrew’s monotone chain algorithm. Sorting n points by x (and then y) takes O(n log n) time; constructing the lower and upper hulls each requires a single pass where points are added and occasionally popped using an orientation test, for O(n) additional time. Thus, total time is O(n log n), and the algorithm uses O(n) space to store the sorted points and hull. The polygon convexity checker runs in O(n) time by iterating once over the edges and computing cross products of consecutive edge vectors; it uses O(1) auxiliary space beyond the input polygon. Care must be taken with integer overflow in cross products for large coordinates; using 64-bit integers or wider intermediates mitigates this. For discrete convex sequences, checking convexity via second differences is O(n) time and O(1) space. Finding the minimum index in a unimodal (discretely convex) array with a ternary-search-like approach takes O(log n) time and O(1) space, since each comparison halves (or reduces by a constant fraction) the search interval. If the sequence is not convex (or at least unimodal), the O(log n) guarantee does not hold, and a fallback O(n) scan is necessary. In all cases, constant factors are low, with core operations dominated by arithmetic (cross products, difference computations) and comparisons. Numerical stability considerations suggest avoiding exact equality checks with floating-point inputs; when using integers, remain within safe bounds or promote to 128-bit intermediates for cross products.

Code Examples

Convex Hull via Andrew’s Monotone Chain (2D)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Point {
5 long long x, y;
6 bool operator<(const Point& other) const {
7 if (x != other.x) return x < other.x;
8 return y < other.y;
9 }
10 bool operator==(const Point& other) const {
11 return x == other.x && y == other.y;
12 }
13};
14
15// Cross product of OA x OB (with O as origin)
16static inline __int128 cross128(const Point& O, const Point& A, const Point& B) {
17 // Use 128-bit intermediate to avoid overflow for large coordinates
18 __int128 ax = (__int128)A.x - O.x;
19 __int128 ay = (__int128)A.y - O.y;
20 __int128 bx = (__int128)B.x - O.x;
21 __int128 by = (__int128)B.y - O.y;
22 return ax * by - ay * bx;
23}
24
25vector<Point> convex_hull(vector<Point> pts) {
26 int n = (int)pts.size();
27 if (n <= 1) return pts;
28
29 sort(pts.begin(), pts.end());
30 pts.erase(unique(pts.begin(), pts.end()), pts.end());
31 n = (int)pts.size();
32 if (n <= 1) return pts;
33
34 vector<Point> lower, upper;
35
36 // Build lower hull
37 for (const auto& p : pts) {
38 while (lower.size() >= 2) {
39 __int128 cr = cross128(lower[lower.size()-2], lower[lower.size()-1], p);
40 if (cr <= 0) lower.pop_back(); // Remove non-left turns to keep hull convex
41 else break;
42 }
43 lower.push_back(p);
44 }
45
46 // Build upper hull
47 for (int i = n - 1; i >= 0; --i) {
48 const auto& p = pts[i];
49 while (upper.size() >= 2) {
50 __int128 cr = cross128(upper[upper.size()-2], upper[upper.size()-1], p);
51 if (cr <= 0) upper.pop_back();
52 else break;
53 }
54 upper.push_back(p);
55 }
56
57 // Concatenate lower and upper to form full hull; last point of each is duplicate of the first of the other
58 lower.pop_back();
59 upper.pop_back();
60
61 vector<Point> hull;
62 hull.reserve(lower.size() + upper.size());
63 hull.insert(hull.end(), lower.begin(), lower.end());
64 hull.insert(hull.end(), upper.begin(), upper.end());
65
66 return hull; // Counterclockwise order
67}
68
69int main() {
70 ios::sync_with_stdio(false);
71 cin.tie(nullptr);
72
73 vector<Point> pts = {{0,0},{2,1},{1,1},{2,2},{1,2},{3,0},{0,3},{3,3}};
74 auto hull = convex_hull(pts);
75
76 cout << "Hull has " << hull.size() << " points:\n";
77 for (auto& p : hull) cout << p.x << " " << p.y << "\n";
78 return 0;
79}
80

This program computes the convex hull of 2D points using Andrew’s monotone chain. It sorts the points, builds the lower and upper hulls using a left-turn test (cross product), and concatenates them. Using 128-bit intermediates avoids overflow during cross product for large coordinates. The resulting hull is listed counterclockwise without repeating the starting point.

Time: O(n log n)Space: O(n)
Check if a Simple Polygon is Convex (Orientation Consistency)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Point { long long x, y; };
5
6static inline __int128 cross128(const Point& A, const Point& B, const Point& C) {
7 // Cross of AB x AC using 128-bit intermediates
8 __int128 abx = (__int128)B.x - A.x;
9 __int128 aby = (__int128)B.y - A.y;
10 __int128 acx = (__int128)C.x - A.x;
11 __int128 acy = (__int128)C.y - A.y;
12 return abx * acy - aby * acx;
13}
14
15bool isConvexPolygon(const vector<Point>& P, bool allow_collinear_on_edges = true) {
16 int n = (int)P.size();
17 if (n < 3) return false; // not a polygon
18 int sign = 0; // +1 for left turns, -1 for right turns
19
20 for (int i = 0; i < n; ++i) {
21 const Point& A = P[i];
22 const Point& B = P[(i+1)%n];
23 const Point& C = P[(i+2)%n];
24 __int128 cr = cross128(A, B, C);
25 if (cr == 0) {
26 if (!allow_collinear_on_edges) return false; // flat angle not allowed
27 continue; // ignore flat turns if allowed
28 }
29 int cur = (cr > 0) ? +1 : -1;
30 if (sign == 0) sign = cur; // set initial orientation
31 else if (cur != sign) return false; // mixed turns => concave
32 }
33 return true; // all turns consistent (or flat when allowed)
34}
35
36int main() {
37 ios::sync_with_stdio(false);
38 cin.tie(nullptr);
39
40 // Convex square (counterclockwise)
41 vector<Point> square = {{0,0},{2,0},{2,2},{0,2}};
42 cout << (isConvexPolygon(square) ? "convex" : "concave") << "\n";
43
44 // Concave pentagon (a simple arrow shape)
45 vector<Point> concave = {{0,0},{3,0},{2,1},{3,2},{0,2}};
46 cout << (isConvexPolygon(concave) ? "convex" : "concave") << "\n";
47
48 // Polygon with collinear edge allowed
49 vector<Point> col = {{0,0},{2,0},{4,0},{4,2},{0,2}}; // three collinear along bottom
50 cout << (isConvexPolygon(col, true) ? "convex" : "concave") << "\n";
51 cout << (isConvexPolygon(col, false) ? "convex" : "concave") << "\n";
52
53 return 0;
54}
55

The function checks convexity of a simple polygon given in order (clockwise or counterclockwise) by ensuring all consecutive triples of vertices produce turns with consistent sign. If allow_collinear_on_edges is true, zero cross products (flat angles) are permitted; otherwise, any collinearity invalidates convexity. This method assumes the polygon is simple (non-self-intersecting).

Time: O(n)Space: O(1)
Discrete Convexity Check and O(log n) Argmin in a Convex (Unimodal) Sequence
1#include <bits/stdc++.h>
2using namespace std;
3
4// Check discrete convexity via nonnegative second differences
5bool isConvexSequence(const vector<long long>& f) {
6 int n = (int)f.size();
7 if (n < 3) return true; // trivially convex for length < 3
8 for (int i = 1; i + 1 < n; ++i) {
9 // Second difference: f[i-1] - 2*f[i] + f[i+1] >= 0
10 __int128 s = (__int128)f[i-1] - 2*(__int128)f[i] + (__int128)f[i+1];
11 if (s < 0) return false;
12 }
13 return true;
14}
15
16// Find index of minimum in a unimodal/convex discrete sequence using ternary-like search
17int argminUnimodal(const vector<long long>& f) {
18 int n = (int)f.size();
19 int l = 0, r = n - 1;
20 while (r - l > 3) {
21 int m1 = l + (r - l) / 3;
22 int m2 = r - (r - l) / 3;
23 if (f[m1] <= f[m2]) {
24 r = m2 - 1; // minimum is in [l, m2-1]
25 } else {
26 l = m1 + 1; // minimum is in [m1+1, r]
27 }
28 }
29 int best = l;
30 for (int i = l + 1; i <= r; ++i) if (f[i] < f[best]) best = i;
31 return best;
32}
33
34int main() {
35 ios::sync_with_stdio(false);
36 cin.tie(nullptr);
37
38 // A convex sequence sampled from i^2 with some shift
39 vector<long long> g = {16, 9, 4, 1, 0, 1, 4, 9, 16}; // minimum at index 4
40
41 cout << (isConvexSequence(g) ? "convex" : "not convex") << "\n";
42 int idx = argminUnimodal(g);
43 cout << "argmin index: " << idx << ", value: " << g[idx] << "\n";
44
45 // Non-convex sequence example
46 vector<long long> h = {0, 2, 1, 2, 0};
47 cout << (isConvexSequence(h) ? "convex" : "not convex") << "\n";
48 // For non-convex sequences, argminUnimodal may fail to give guarantees; use linear scan instead
49 int best = min_element(h.begin(), h.end()) - h.begin();
50 cout << "(fallback) argmin index: " << best << ", value: " << h[best] << "\n";
51
52 return 0;
53}
54

This program first verifies discrete convexity of a sequence by checking that all second differences are nonnegative. If a sequence is convex (hence unimodal), we can find the minimum index in O(log n) time using a ternary-search-style strategy that shrinks the search interval based on comparing values at two interior points. For sequences that are not convex (or at least unimodal), guarantees are lost and a linear scan is safer.

Time: Convexity check: O(n); argmin search: O(log n)Space: O(1)
#convex set#convex function#convex hull#jensen inequality#hessian psd#subgradient#supporting hyperplane#minkowski sum#discrete convexity#ternary search#polygon convexity#andrew monotone chain#caratheodory theorem#projection convex set