Research · 12 min · May 15, 2026

The Lab Notebook: Best Practices for Reproducible Research

Six months from now you will need to know exactly which seed produced Figure 3, which commit ran the ablation, and why you stopped using the v2 dataset. A working lab notebook is not paperwork — it is the cheapest insurance against re-running months of work. Here is how to keep one that actually pays off.

By Jin Park

Founder & Editorial Lead

1. What a Lab Notebook Is Actually For

A lab notebook is not a diary, not a status report for your advisor, and not a place to write polished prose. It is the answer to one question asked six months from now: "What exactly did past-me do, and can I redo it?" The realistic audiences, in order: (1) future-you trying to reproduce a result for a rebuttal, (2) the labmate inheriting your code after you graduate, (3) the reviewer who asks "what happens if you change X," (4) the journal demanding a data-availability statement, (5) the IRB or funding agency during an audit, (6) you again, eight years later, when a citing paper claims your result does not replicate. Every entry should serve at least one of these readers. If it serves none, do not write it; if it serves all of them, you wrote it right.

Three Questions a Good Entry Answers

What did you do today, in enough detail that a careful labmate could rerun it?
What did you observe, including the result you did not expect?
What will you do next, and why this and not something else?

2. Paper vs Electronic vs Hybrid: Pick the One You Will Actually Use

The notebook you keep daily beats the perfect one you set up and abandon in three weeks. Three stable patterns: (1) Pure paper: bound, numbered pages, signed and dated; still common in wet labs and required by some institutions for IP reasons; offline, distraction-free, but impossible to grep. (2) Pure electronic: Obsidian, Notion, LabArchives, Benchling, or plain Markdown in a Git repo; searchable, syncable, embeds figures and code, but easy to fragment across tools. (3) Hybrid: a paper notebook for raw observations at the bench or whiteboard, scanned and indexed into an electronic one within 24 hours; gives you the speed of handwriting and the searchability of text. Whichever you pick, commit for at least a full semester before switching; the cost of migration is higher than the cost of a slightly worse tool.

Choosing the Right Format for Your Field

Wet lab, regulated environment: paper or institutional ELN (LabArchives, Benchling)
Computational/ML: Markdown in Git + experiment tracker (MLflow, W&B, Aim)
Theory or math: paper for derivations, LaTeX or Obsidian for typed notes
Mixed methods: hybrid, with a weekly 30-minute scan-and-index session
Required by IP/regulatory: whatever your office of research says — non-negotiable

3. The One-Entry-Per-Experiment Rule

Resist the urge to keep one rolling "today" log. Create one entry per experiment, indexed by date and a short stable ID (`2026-05-15-ablation-dropout`, `EXP-0042`). Each entry has the same skeleton: hypothesis in one sentence, method as a short bulleted protocol or a link to the exact commit hash, what you actually ran (commands, parameters, seeds), what you observed (numbers, plots, anomalies), and what you concluded. Keep failed experiments: they are often the most valuable entries six months later, when a reviewer asks "did you try X?" A negative result you logged can save weeks of re-running; a negative result you remember vaguely is worth nothing.

Entry Template That Survives Contact With Reality

Title + date + stable ID (use the same ID in filenames, commits, run names)
Hypothesis: one sentence — what you expect and why
Method: link to commit, list parameter changes from the previous run
Run: exact command, seed, hardware, wall-clock time
Result: numbers in a table, screenshot of the key plot, raw log path
Interpretation: one paragraph, including what surprised you
Next step: the single concrete action this entry implies
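The template above can be turned into a file generator so that every entry starts from the same skeleton. The following is a minimal sketch, not a prescribed tool: the function names `make_entry_id` and `render_entry`, the Markdown heading style, and the `TODO` placeholders are assumptions made for illustration.

```python
from datetime import date

# Section headings taken from the entry template; order matters for readers.
ENTRY_FIELDS = [
    "Hypothesis",
    "Method",
    "Run",
    "Result",
    "Interpretation",
    "Next step",
]


def make_entry_id(day: date, slug: str) -> str:
    """Build a stable, sortable ID such as 2026-05-15-ablation-dropout."""
    return f"{day.isoformat()}-{slug}"


def render_entry(day: date, slug: str, title: str) -> str:
    """Return an empty Markdown entry containing the fixed skeleton."""
    entry_id = make_entry_id(day, slug)
    lines = [
        f"# {title}",
        "",
        f"- Date: {day.isoformat()}",
        f"- ID: {entry_id}",
        "",
    ]
    for field in ENTRY_FIELDS:
        # Each section starts as a placeholder to be filled in the same day.
        lines += [f"## {field}", "", "TODO", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_entry(date(2026, 5, 15), "ablation-dropout", "Dropout ablation"))
```

Reusing the value returned by `make_entry_id` in filenames, commit messages, and run names keeps the ID searchable across tools.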

4. Versioning, Seeds, and the Provenance Trail

Three pieces of information rescue most "I cannot reproduce my own result" emergencies: the exact code commit, the exact data version, and the exact random seed. Make all three impossible to lose. Record the commit hash in the entry by pasting the output of `git log -1 --format=%H`. Pin the dataset with a content hash or a versioned reference (DVC, Git LFS, S3 versioning, or a frozen tarball with its SHA-256 in the entry). Set the seed explicitly in code and write it down; "default" is not a seed. For ML experiments, also log the CUDA version, library versions (`pip freeze > env.txt`), and hardware (`nvidia-smi`); some results that look irreproducible reproduce once the CUDA minor version matches. The whole provenance triple takes about 30 seconds to capture and can save weeks.
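One way to make the commit, data hash, and seed hard to lose is to collect them in a single call at the start of each run. The sketch below assumes the script runs inside a Git working tree and that the dataset is a single file; the function names and the dictionary keys are illustrative, not a standard format.

```python
import hashlib
import random
import subprocess
from pathlib import Path
from typing import Optional


def current_commit() -> str:
    """Return the full hash of HEAD, the same value `git log -1 --format=%H` prints."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%H"],
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance(data_path: Path, seed: int, commit: Optional[str] = None) -> dict:
    """Seed Python's RNG and return the code/data/seed triple as a dict."""
    # Only the standard-library RNG is seeded here; NumPy, PyTorch, and other
    # frameworks keep their own generators and need their own seeding calls.
    random.seed(seed)
    return {
        "commit": commit if commit is not None else current_commit(),
        "data_sha256": sha256_of(data_path),
        "seed": seed,
    }
```

Dumping the returned dictionary as JSON next to the run's logs, and pasting it into the entry, gives the notebook and the output directory the same record.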

5. Plots, Tables, and Raw Data Stay Together

A finding in your notebook should let you click through to the raw data and the script that made the figure, without hunting. Standard layout: one directory per experiment ID, containing `run.sh` (the exact command), `config.yaml` (parameters), `logs/` (stdout/stderr), `outputs/` (model checkpoints, raw measurements), `figures/` (plot images and the notebook that generated them), and a `README.md` (the notebook entry itself). The notebook entry links into this directory. Never copy a number from a plot into prose without also linking the source file. Next year, when the plot looks wrong, you need to find the data in under one minute, not one afternoon. For wet-lab notebooks, the equivalent is: tape the printed gel/blot/sequence trace into the notebook page, with the file path of the raw scan written next to it.
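For computational projects, the layout described above is easy to create by script, so that no experiment starts without it. This is a minimal sketch under the assumption that empty placeholder files are acceptable starting points; the function name `scaffold_experiment` is hypothetical.

```python
from pathlib import Path

# Names follow the standard layout described in the text.
SUBDIRS = ["logs", "outputs", "figures"]
FILES = ["run.sh", "config.yaml", "README.md"]


def scaffold_experiment(project_root: Path, exp_id: str) -> Path:
    """Create one directory for exp_id containing the standard layout."""
    exp_dir = project_root / exp_id
    # exist_ok=False refuses to reuse an ID, which would mix two experiments.
    exp_dir.mkdir(parents=True, exist_ok=False)
    for name in SUBDIRS:
        (exp_dir / name).mkdir()
    for name in FILES:
        (exp_dir / name).touch()
    return exp_dir


if __name__ == "__main__":
    created = scaffold_experiment(Path("my-project"), "EXP-0042")
    print(f"created {created}")
```

Failing loudly on a duplicate ID is a deliberate choice here: silently writing into an existing directory is how results from two runs end up interleaved.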

Directory Layout That Stays Manageable Through Year Five

One top-level folder per project, one subfolder per experiment ID
Use ISO dates (2026-05-15) and zero-padded numbers (EXP-0042) for sortable filenames
Never put data inside the code repo — symlink or reference by path
Keep raw data read-only at the OS level (`chmod -w`); you cannot accidentally overwrite a file you cannot write to
Back up the project folder weekly to a second location, not just iCloud
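The read-only rule can be applied to a whole raw-data folder at once. The sketch below clears the write permission bits on every regular file under a directory, roughly what `chmod a-w` does per file on a POSIX system; the function name `make_read_only` is hypothetical, and directories are left writable, so this guards against overwriting existing files but not against adding new ones.

```python
import stat
from pathlib import Path

# Write permission for owner, group, and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(raw_dir: Path) -> int:
    """Clear write bits on every regular file under raw_dir; return the count."""
    changed = 0
    for path in raw_dir.rglob("*"):
        # Skip directories and symlinks; only real data files are locked.
        if path.is_file() and not path.is_symlink():
            mode = path.stat().st_mode
            path.chmod(mode & ~WRITE_BITS)
            changed += 1
    return changed
```

Running this once after each data import, for example `make_read_only(Path("raw"))`, is usually enough; a user with sufficient privileges can still restore write access deliberately.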

6. Weekly Index, Monthly Review, Quarterly Audit

A notebook with 400 entries and no index is write-only. Three review cadences fix this. Weekly (15 minutes, end of Friday): write a 5-line summary of the week's experiments at the top of a `weekly/` page, link to the entries, and flag any results that need follow-up. Monthly (30 minutes): re-read the weekly summaries and write a one-paragraph "state of the project": what is the current best result, what is the next blocker, what experiments are dead branches. Quarterly (1 hour): pick three random old entries and try to reproduce them from the notebook alone, without your memory; gaps you find tell you what the template is missing. The reproduction audit is the single highest-value hour in your research quarter; it catches drift before it costs you a paper.
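Picking the three audit entries by script removes the temptation to choose the ones you already know will reproduce. This is a minimal sketch that assumes each entry lives at `<experiment-id>/README.md` under one project folder; the function name `pick_audit_entries` and the optional `seed` parameter are illustrative.

```python
import random
from pathlib import Path
from typing import List, Optional


def pick_audit_entries(
    entries_dir: Path, k: int = 3, seed: Optional[int] = None
) -> List[Path]:
    """Return up to k randomly chosen entry files from entries_dir."""
    # Sorting first makes the choice depend only on the seed, not on the
    # order in which the filesystem happens to list directories.
    entries = sorted(entries_dir.glob("*/README.md"))
    rng = random.Random(seed)
    return rng.sample(entries, min(k, len(entries)))


if __name__ == "__main__":
    for entry in pick_audit_entries(Path("my-project"), k=3):
        print(entry)
```

Passing a seed and writing it in the audit's own entry makes the audit selection reproducible too.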

7. Mistakes to Stop Making This Week

Six anti-patterns that ruin otherwise-good notebooks. (1) "I will write it up later": by the time later arrives, the numbers and the rationale are gone; write the entry the day you run the experiment. (2) Editing old entries to make them look smarter: instead, add a dated "addendum" so the history is preserved. (3) Storing only the final plot: keep the intermediate ones; a rejected plot sometimes turns out to be the paper's most useful figure. (4) Sharing the notebook only via Slack screenshots: your advisor cannot search Slack a year later; link the entry. (5) Writing for an audience that does not exist (a hypothetical tenure committee): write for the reviewer who will ask one specific question. (6) Skipping the entry on a "small" experiment: many papers are won or lost by which small experiments you can still account for a year later.
