PyLateInformationRetrievalEvaluator

This class evaluates an Information Retrieval (IR) setting. It is a direct extension of the InformationRetrievalEvaluator from the sentence-transformers library that overrides only the compute_metrices method to make it compatible with PyLate models (it defines asymmetric encoding through the is_query parameter and adds padding).

Parameters

  • queries ('dict[str, str]')

  • corpus ('dict[str, str]')

  • relevant_docs ('dict[str, set[str]]')

  • corpus_chunk_size ('int') – defaults to 50000

  • mrr_at_k ('list[int]') – defaults to [10]

  • ndcg_at_k ('list[int]') – defaults to [10]

  • accuracy_at_k ('list[int]') – defaults to [1, 3, 5, 10]

  • precision_recall_at_k ('list[int]') – defaults to [1, 3, 5, 10]

  • map_at_k ('list[int]') – defaults to [100]

  • show_progress_bar ('bool') – defaults to False

  • batch_size ('int') – defaults to 32

  • name ('str') – defaults to '' (empty string)

  • write_csv ('bool') – defaults to True

  • truncate_dim ('int | None') – defaults to None

  • score_functions ('dict[str, Callable[[Tensor, Tensor], Tensor]] | None') – defaults to None

  • main_score_function ('str | SimilarityFunction | None') – defaults to None

  • query_prompt ('str | None') – defaults to None

  • query_prompt_name ('str | None') – defaults to None

  • corpus_prompt ('str | None') – defaults to None

  • corpus_prompt_name ('str | None') – defaults to None
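
The three required inputs are plain dictionaries keyed by id. The following is a minimal sketch of how they fit together, not a definitive usage: the ids and texts are made up, find_dangling_ids is a hypothetical helper, and the import path pylate.evaluation is an assumption based on where this page sits in the API reference.

```python
# Toy data: every id and text here is hypothetical.
queries = {
    "q1": "what is late interaction?",
    "q2": "capital of France",
}
corpus = {
    "d1": "Late interaction models compare token-level embeddings.",
    "d2": "Paris is the capital of France.",
    "d3": "Bananas are rich in potassium.",
}
# Maps each query id to the set of corpus ids judged relevant for it.
relevant_docs = {"q1": {"d1"}, "q2": {"d2"}}


def find_dangling_ids(queries, corpus, relevant_docs):
    """Hypothetical sanity check: list (query id, doc id) pairs whose ids
    are missing from `queries` or `corpus`. A missing query id is reported
    as (query id, None)."""
    dangling = []
    for query_id, doc_ids in relevant_docs.items():
        if query_id not in queries:
            dangling.append((query_id, None))
        for doc_id in sorted(doc_ids):
            if doc_id not in corpus:
                dangling.append((query_id, doc_id))
    return dangling


def build_evaluator():
    """Build the evaluator from the toy data.

    Assumption: the class is exposed as
    pylate.evaluation.PyLateInformationRetrievalEvaluator and pylate is
    installed. The keyword names are taken from the parameter list above.
    """
    from pylate import evaluation

    return evaluation.PyLateInformationRetrievalEvaluator(
        queries=queries,
        corpus=corpus,
        relevant_docs=relevant_docs,
        name="toy",
        write_csv=False,
    )
```

Checking the ids before building the evaluator is a design choice of this sketch; it catches a mistyped document id early instead of silently lowering the scores.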

Attributes

  • description

    Returns a human-readable description of the evaluator, derived from the class name in two steps: (1) remove "Evaluator" from the class name; (2) add a space before every capital letter. For example, BinaryClassificationEvaluator becomes "Binary Classification".
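
The two steps can be sketched in a few lines of Python. This is an illustration of the rule as described, not the library's actual implementation; describe is a hypothetical name, and the sketch does not add a space before the first letter.

```python
import re


def describe(class_name: str) -> str:
    """Hypothetical helper that mirrors the two documented steps."""
    # Step 1: remove the "Evaluator" suffix from the class name.
    if class_name.endswith("Evaluator"):
        class_name = class_name[: -len("Evaluator")]
    # Step 2: add a space before every capital letter except the first.
    return re.sub(r"(?<!^)(?=[A-Z])", " ", class_name)


print(describe("BinaryClassificationEvaluator"))  # Binary Classification
```

Under the same rule, the name of this class would read "Py Late Information Retrieval", because "PyLate" contains a capital letter in the middle.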

Methods

__call__

This is called during training to evaluate the model. It returns a score for the evaluation with a higher score indicating a better result.

Args:

  • model: the model to evaluate.
  • output_path: path where predictions and metrics are written.
  • epoch: the epoch in which the evaluation takes place. This is used for the file prefixes. If this is -1, evaluation on test data is assumed.
  • steps: the number of steps in the current epoch at the time of the evaluation. This is used for the file prefixes. If this is -1, evaluation at the end of the epoch is assumed.

Returns: Either a single score for the evaluation, with a higher score indicating a better result, or a dictionary of scores. In the latter case, evaluator.primary_metric must be defined.

Parameters

  • model ('SentenceTransformer')
  • output_path ('str') – defaults to None
  • epoch ('int') – defaults to -1
  • steps ('int') – defaults to -1
  • args
  • kwargs
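
Because the call may return either a single score or a dictionary of scores, calling code usually has to normalise the result. The following is a minimal sketch of that step under the contract described above; extract_score and the metric key in the usage lines are hypothetical, not part of the library.

```python
def extract_score(result, primary_metric=None):
    """Hypothetical helper: reduce an evaluator result to one float.

    `result` is either a number or a dict of metric name to score.
    For a dict, `primary_metric` names the entry to use as the score.
    """
    if isinstance(result, dict):
        if primary_metric is None:
            raise ValueError("primary_metric must be defined for dict results")
        return float(result[primary_metric])
    return float(result)


# Usage with made-up values; the key "toy_ndcg@10" is illustrative only.
print(extract_score(0.42))
print(extract_score({"toy_ndcg@10": 0.37}, primary_metric="toy_ndcg@10"))
```
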
compute_dcg_at_k
compute_metrices
compute_metrics
output_scores
prefix_name_to_metrics
store_metrics_in_model_card_data
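
This page lists compute_dcg_at_k without a description. As background for the ndcg_at_k parameter, the following sketch shows the standard discounted cumulative gain formula, where the relevance at rank i (counted from 1) is divided by log2(i + 1). It assumes the method follows this textbook definition; dcg_at_k is a hypothetical standalone function, not the library's code.

```python
import math


def dcg_at_k(relevances, k):
    """Textbook DCG over the first k relevance values of a ranked list."""
    total = 0.0
    for index, relevance in enumerate(relevances[:k]):
        # index is 0-based, so rank = index + 1 and the discount is log2(rank + 1).
        total += relevance / math.log2(index + 2)
    return total


# Relevant documents at ranks 1 and 3: 1/log2(2) + 0 + 1/log2(4) = 1.5
print(dcg_at_k([1, 0, 1], k=3))
```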