Theory: Median Center

Outliers and the Median

As an economy develops, income and wealth tend to concentrate in specific individuals or groups. When using the mean income as a standard, the presence of high-income earners increases the average, making it appear higher than what most people perceive as “typical.”

Although the mean is the most widely used measure of central tendency, social and natural phenomena often include outliers, and if we make interpretations or decisions based on such distorted central values, the outcomes can deviate from reality.

For example, in public policy, decisions are often made for the benefit of the middle or lower-income groups rather than the wealthy. In such cases, the median income is used more frequently than the mean. The median, which is less sensitive to outliers, is a descriptive statistic that better reflects the central tendency of skewed data. It refers to the middle value when data is sorted from smallest to largest.

For instance, in the dataset 1, 5, 10, 17, 97, the median is 10. The mean, however, is 26, pulled upward by the single large value 97, so it does not reasonably represent the overall trend of the data.
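As a quick check, both statistics can be recomputed for the dataset above with Python's standard `statistics` module. The second list, `extreme`, is an illustrative variation (not from the original data) in which the largest value is made ten times larger, to show how each measure reacts.

```python
from statistics import mean, median

values = [1, 5, 10, 17, 97]
print(mean(values))    # 26, pulled upward by 97
print(median(values))  # 10, the middle value after sorting

# Illustrative variation: make the outlier ten times larger.
# The mean jumps, while the median does not move.
extreme = [1, 5, 10, 17, 970]
print(mean(extreme))    # 200.6
print(median(extreme))  # 10
```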

Median Center

In spatial data, the concept corresponding to the median is the median center. By analogy with the mean center, one could compute a median center by taking the median of the X coordinates and the median of the Y coordinates separately. However, this method is rarely used in spatial analysis.
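For reference, the coordinate-wise approach can be sketched in a few lines. The `points` list below is made-up sample data, used only for illustration.

```python
from statistics import median

# Hypothetical sample coordinates (x, y)
points = [(0, 0), (2, 1), (3, 5), (10, 2), (4, 4)]

xs = [x for x, _ in points]
ys = [y for _, y in points]

# Median of each coordinate taken independently
center = (median(xs), median(ys))
print(center)  # (3, 2)
```

Note that the result, (3, 2), is not one of the input points: the two medians are taken independently, so they can come from different observations.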

Instead, spatial analysts typically define the median center as the point that minimizes the total distance from all data points to it. For example, imagine that a group of soldiers on leave need to gather in a single location. The optimal meeting point, the one that requires the shortest total travel distance from all their locations, is the median center.

To calculate it, one must compute the distances from all data points to a potential center and iteratively find the point where the total distance is minimized.

This concept is known as the geometric median, L1 center, or Weiszfeld-based median center. It is the point \((x_m, y_m)\) that minimizes the total Euclidean distance to all data points, as shown below:

$$\min f(x_m, y_m) = \sum_{i=1}^{n} \sqrt{(x_i - x_m)^2 + (y_i - y_m)^2}, \quad \textit{Median Center} = (x_m, y_m)$$

The objective is the sum of Euclidean distances from the candidate center to every point. Unlike the mean center, its minimizer has no closed-form solution in general, so it must be found with an iterative optimization method. A widely used approach is the Weiszfeld algorithm, which repeatedly re-estimates the center as a weighted average of the data points, weighting each point by the inverse of its distance to the current estimate, until the estimate stops moving.
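To make the objective concrete, the sketch below evaluates the total distance for two candidate centers. The function name `total_distance` and the three sample points are assumptions chosen for illustration. Any optimization method, iterative or otherwise, is searching for the candidate that makes this value as small as possible.

```python
import math

def total_distance(center, points):
    """Sum of Euclidean distances from `center` to every point."""
    cx, cy = center
    return sum(math.hypot(x - cx, y - cy) for x, y in points)

# Hypothetical sample points
points = [(0, 0), (4, 0), (0, 3)]

# Candidate 1: the data point at the origin
print(total_distance((0, 0), points))  # 7.0

# Candidate 2: the mean center (average of x, average of y)
n = len(points)
mean_center = (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
print(total_distance(mean_center, points))  # slightly below 7
```

Here the mean center happens to score a little better than the origin, but neither candidate is guaranteed to be the minimizer.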

Application Examples

  • Facility location optimization: Find a location that minimizes the total distance for customers
  • Disaster response analysis: Use the median center of emergency calls to select initial response points
  • Public facility distribution analysis: Analyze the spatial median of the population
  • Optimal logistics hub placement: Select locations that minimize delivery distances
  • Rescue center placement: Determine the center of clustered rescue requests in disaster scenarios

Learn More: Weiszfeld Algorithm

  1. Set an initial center point \((x^{(0)}, y^{(0)})\).
  2. Repeat the following update formulas:
    $$x^{(k+1)}=\frac{\sum_{i=1}^{n}\dfrac{x_i}{d_i^{(k)}}}{\sum_{i=1}^{n}\dfrac{1}{d_i^{(k)}}}, \quad y^{(k+1)}=\frac{\sum_{i=1}^{n}\dfrac{y_i}{d_i^{(k)}}}{\sum_{i=1}^{n}\dfrac{1}{d_i^{(k)}}}$$
    where \(d_i^{(k)}=\sqrt{(x^{(k)}-x_i)^2+(y^{(k)}-y_i)^2}\), i.e., the distance from the center at iteration \(k\) to point \(i\).
  3. Stop when the change in the center is sufficiently small (convergence).
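
The three steps above can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function name `weiszfeld`, the defaults `tol` and `max_iter`, the choice of the mean center as the starting point, and the sample points are all assumptions. In particular, skipping a data point that coincides with the current estimate is a simplification used here only to avoid division by zero.

```python
import math

def weiszfeld(points, tol=1e-9, max_iter=1000):
    """Approximate the median center (geometric median) of 2D points."""
    n = len(points)
    # Step 1: initial center (assumption: start from the mean center)
    x = sum(px for px, _ in points) / n
    y = sum(py for _, py in points) / n

    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)  # d_i^(k)
            if d < 1e-12:
                # Simplification: ignore a point that coincides with the
                # current estimate, since its weight 1/d would be undefined.
                continue
            w = 1.0 / d
            num_x += px * w
            num_y += py * w
            denom += w
        if denom == 0.0:
            break  # every point coincides with the estimate

        # Step 2: inverse-distance weighted average of the points
        new_x, new_y = num_x / denom, num_y / denom
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y

        # Step 3: stop when the center barely moves
        if shift < tol:
            break
    return x, y

# Hypothetical sample points with one distant location
points = [(0, 0), (4, 0), (0, 3), (10, 10)]
center = weiszfeld(points)
print(center)
```

For four points forming a convex quadrilateral, the geometric median is the intersection of the diagonals, which for this sample is (12/7, 12/7), roughly (1.714, 1.714). The mean center of the same points is (3.5, 3.25), noticeably pulled toward the distant point (10, 10).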