Beginner Explanation
Imagine you're looking for your favorite toy in a big toy box. You pull out toys one by one, and each time you find a toy you like, you feel happy. Mean Average Precision (MAP) is like keeping track of how quickly you find your favorite toys compared to everything you pulled out. It gives you a score for how good you are at finding the toys you want among all the toys in the box.

Technical Explanation
Mean Average Precision (MAP) is a metric used to evaluate the performance of information retrieval systems. Precision at rank k is the number of relevant documents among the top k retrieved results, divided by k. The average precision (AP) of a single query is the mean of the precision values at each rank where a relevant document appears; in the standard formulation, the sum of these precision values is divided by the total number of relevant documents for the query. MAP is then the mean of the AP values across all queries. A Python implementation is given in the Code Examples section below.

Academic Context
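To make the definition concrete, here is a small hand-checkable sketch (pure Python; the toy relevance list is an assumption chosen for illustration) that computes precision at each relevant rank and the resulting AP for one ranked list:

```python
# Toy relevance judgments for one query's ranked results, ordered by
# descending score: 1 = relevant, 0 = not relevant (illustrative data).
ranking = [1, 0, 1, 0]

hits = 0
precisions_at_hits = []
for rank, rel in enumerate(ranking, start=1):
    if rel:
        hits += 1
        # Precision at this rank: relevant documents seen so far / rank
        precisions_at_hits.append(hits / rank)

# AP averages the precision values at the ranks of the relevant documents.
ap = sum(precisions_at_hits) / hits
print(precisions_at_hits)  # [1.0, 0.6666666666666666]
print(ap)                  # 0.8333333333333333
```

The relevant documents sit at ranks 1 and 3, so the precision values are 1/1 and 2/3, giving an AP of (1 + 2/3) / 2 = 5/6.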
Mean Average Precision (MAP) is a widely used metric in information retrieval and machine learning, particularly for evaluating ranking systems. It was introduced in the context of the TREC (Text Retrieval Conference) evaluations and is grounded in the principles of precision and recall. The mathematical foundation of MAP involves calculating the average precision for each query and taking the mean across all queries. Key papers discussing MAP include 'Evaluating Inference in Information Retrieval' by Buckley et al. (1995) and 'An Evaluation of Information Retrieval Models' by M. Sanderson, which provide in-depth discussions of precision, recall, and ranking metrics. MAP is particularly useful where the ranking of results is critical, such as in search engines and recommendation systems.

Code Examples
Example 1:

```python
import numpy as np

def average_precision(y_true, y_scores):
    # Sort the true labels by descending predicted score
    sorted_indices = np.argsort(y_scores)[::-1]
    y_true = y_true[sorted_indices]
    precision_sum = 0
    relevant_count = 0
    for i, is_relevant in enumerate(y_true):
        if is_relevant:
            relevant_count += 1
            # Precision at rank i + 1, accumulated at each relevant document
            precision_sum += relevant_count / (i + 1)
    return precision_sum / relevant_count if relevant_count > 0 else 0

# Example usage
queries = [[1, 0, 1, 0], [1, 1, 0, 0]]  # True relevance
scores = [[0.9, 0.1, 0.8, 0.4], [0.6, 0.7, 0.2, 0.3]]  # Predicted scores
map_score = np.mean([average_precision(np.array(q), np.array(s)) for q, s in zip(queries, scores)])
print(f'Mean Average Precision: {map_score}')
```

With these toy inputs, both rankings place every relevant document ahead of every irrelevant one, so the script prints a MAP of 1.0.
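In practice, rankings are often evaluated only down to a cutoff k (MAP@k), as in search result pages or recommender systems. The sketch below (pure Python; the function name and cutoff are my own, and the toy data is reused from Example 1) truncates each ranking to its top k results before computing AP:

```python
def average_precision_at_k(relevance, scores, k):
    """AP@k: average precision computed over the top-k results only.

    `relevance` holds 0/1 judgments and `scores` the predicted scores
    for the same documents. Names here are illustrative, not standard API.
    """
    # Rank documents by descending score and keep only the top k
    ranked = sorted(zip(scores, relevance), key=lambda p: p[0], reverse=True)[:k]
    hits = 0
    precision_sum = 0.0
    for rank, (_, rel) in enumerate(ranked, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

# Example usage with the same toy queries as Example 1
queries = [[1, 0, 1, 0], [1, 1, 0, 0]]
scores = [[0.9, 0.1, 0.8, 0.4], [0.6, 0.7, 0.2, 0.3]]
k = 2
map_at_k = sum(average_precision_at_k(q, s, k)
               for q, s in zip(queries, scores)) / len(queries)
print(f'MAP@{k}: {map_at_k}')
```

For both toy queries the two relevant documents occupy the top two ranks, so MAP@2 is again 1.0; with a less accurate ranker, truncating at k typically lowers the score because relevant documents pushed below the cutoff contribute nothing.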
View Source: https://arxiv.org/abs/2511.16654v1