
Proteomics

A proteomics benchmark: method comparison, pipeline behavior, and noise analysis using ESM-2


I wanted to test whether my proteomics pipeline can do two things at the same time:

- not invent differences where none exist
- detect differences where they were deliberately introduced

This dataset is ideal for that purpose, because it is a benchmark with designed true positives (spike-ins) defined by the experiment creators. Below, I show step by step how I verified that the results are trustworthy, and where the pipeline has natural limitations.
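The two criteria can be made concrete with a small scoring sketch. This is only an illustration under stated assumptions: the `Call` record, the `score` function, and the toy protein names are hypothetical and are not taken from the benchmark or from my pipeline. It assumes each protein carries a ground-truth spike-in flag from the experiment design and a significant/not-significant call from the pipeline.

```python
from dataclasses import dataclass


@dataclass
class Call:
    protein: str       # hypothetical identifier, for illustration only
    is_spike_in: bool  # ground truth defined by the benchmark design
    significant: bool  # the pipeline's differential-abundance call


def score(calls):
    """Count invented differences (false positives among background proteins)
    and detected differences (true positives among spike-ins)."""
    spiked = [c for c in calls if c.is_spike_in]
    background = [c for c in calls if not c.is_spike_in]
    tp = sum(c.significant for c in spiked)
    fp = sum(c.significant for c in background)
    return {
        "true_positives": tp,
        "false_positives": fp,
        # Fraction of designed differences that were recovered.
        "sensitivity": tp / len(spiked) if spiked else float("nan"),
        # Fraction of unchanged proteins wrongly called different.
        "false_positive_rate": fp / len(background) if background else float("nan"),
    }


# Toy data, invented purely to exercise the function.
calls = [
    Call("bg_1", False, False),
    Call("bg_2", False, False),
    Call("bg_3", False, True),
    Call("bg_4", False, False),
    Call("spike_1", True, True),
    Call("spike_2", True, False),
]
result = score(calls)
print(result)
```

A trustworthy pipeline should keep `false_positive_rate` low while keeping `sensitivity` high; optimizing only one of the two is easy and says little.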