Part 1 — Environment Setup & Installation
Last updated: 2026-01-08
This tutorial gets you from “I have Python” to “IDTrack imports and has a working local cache directory”.
Learning objectives
Install IDTrack in an isolated environment (conda or pip).
Verify your installation with a short, copy/pasteable checklist.
Understand and set IDTRACK_LOCAL_REPO (your on-disk cache + configuration folder).
Know what “success” looks like before you start building graphs.
Warning: The first real graph build (Part 3) can take time and disk space. This notebook only verifies that your environment is ready.
1.1 — Installation Guide
IDTrack is a Python package. The easiest way to avoid dependency conflicts is to use a fresh environment.
You have two common workflows:
Conda/Mamba environment (recommended if you already use conda)
pip + venv (recommended if you prefer plain Python tooling)
Either option is fine. Pick the one that matches how your lab usually manages Python.
Step 1 — Create an isolated environment
Option A: conda/mamba
Create and activate a clean environment (example uses Python 3.11):
mamba create -n idtrack python=3.11 -y
mamba activate idtrack
Tip: If you are on Apple Silicon and see HDF5/h5py issues later, installing hdf5 via conda often fixes it:
mamba install -n idtrack hdf5 -y
Option B: venv
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
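To confirm from inside Python that the environment you just activated is the one actually running, a small check like the following can help. This is a minimal sketch: active_environment is a hypothetical helper (not part of IDTrack), and it relies only on the standard venv marker (sys.prefix differing from sys.base_prefix) and the CONDA_DEFAULT_ENV variable that conda sets on activation.

```python
import os
import sys


def active_environment():
    """Best-effort label for the environment this interpreter runs in."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the interpreter the venv was created from.
    if sys.prefix != sys.base_prefix:
        return f"venv at {sys.prefix}"
    # conda/mamba export CONDA_DEFAULT_ENV when an environment is activated.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda env '{conda_env}'"
    return "no isolated environment detected"


print(active_environment())
```

Note that conda also reports its base environment this way, so a label of conda env 'base' usually means the idtrack environment was not activated.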
Step 2 — Install IDTrack
If you are installing from PyPI:
pip install idtrack
If you are working from a cloned repository (developer install):
pip install -e .
Expected result: import idtrack works in Python, and idtrack.__version__ prints a version string.
Step 3 — Quick environment report (safe to run)
This cell prints your Python version and tries to import IDTrack.
Expected result: If installation succeeded, you will see an IDTrack version. If it failed, you will see a helpful error message (and the notebook continues).
import platform
import sys
from pathlib import Path
print('Python:', sys.version.split()[0])
print('Executable:', sys.executable)
print('Platform:', platform.platform())
try:
    import idtrack
    print('idtrack version:', getattr(idtrack, '__version__', 'unknown'))
    print('idtrack package path:', Path(idtrack.__file__).resolve())
    IDTRACK_OK = True
except Exception as e:
    print('idtrack import failed ->', repr(e))
    print('Fix: confirm you activated the intended environment, then re-run: pip install idtrack')
    IDTRACK_OK = False
Python: 3.11.12
Executable: /Users/kemalinecik/tools/apps/mamba/envs/idtrack_dev_env/bin/python
Platform: macOS-15.7.2-arm64-arm-64bit
idtrack version: 0.0.5
idtrack package path: /Users/kemalinecik/git_nosync/master_idtrack/idtrack/idtrack/__init__.py
Step 4 — Choose your local repository directory (IDTRACK_LOCAL_REPO)
IDTrack stores cached downloads, graph snapshots, and your external YAML files in one place: your local repository directory.
You can set it in your shell:
export IDTRACK_LOCAL_REPO=/path/to/idtrack_cache
In notebooks, many tutorials fall back to ./idtrack_cache if IDTRACK_LOCAL_REPO is not set.
Tip: In a real project, put this folder somewhere stable (not a temporary directory) so you reuse caches across sessions.
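If you cannot edit your shell profile (for example on a hosted notebook server), the variable can also be set for the current Python process. The sketch below is illustrative only: resolve_local_repo is a hypothetical helper, and the ./idtrack_cache default simply mirrors the fallback described above. Whether IDTrack itself reads the variable at import time or on each call is not stated here, so setting it before importing idtrack is the cautious choice.

```python
import os
from pathlib import Path


def resolve_local_repo(default="./idtrack_cache"):
    """Return the absolute cache path and pin it in this process's environment."""
    raw = os.environ.get("IDTRACK_LOCAL_REPO", default)
    path = Path(raw).expanduser().resolve()
    # Store the absolute form so a later change of working directory
    # does not silently point the tutorials at a different folder.
    os.environ["IDTRACK_LOCAL_REPO"] = str(path)
    return path


print("IDTRACK_LOCAL_REPO ->", resolve_local_repo())
```

A value set this way lasts only for the running kernel; the export line in your shell remains the persistent option.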
Step 5 — Verify local repository read/write (safe to run)
This cell creates the directory (if needed) and writes a tiny test file.
Expected result: You should see OK and the resolved path.
import os
from pathlib import Path
local_repo = Path(os.environ.get('IDTRACK_LOCAL_REPO', './idtrack_cache')).resolve()
local_repo.mkdir(parents=True, exist_ok=True)
test_file = local_repo / '_idtrack_write_test.txt'
test_file.write_text('ok', encoding='utf-8')
print('Local repository:', local_repo)
print('Write test:', 'OK' if test_file.exists() else 'FAILED')
Local repository: /Users/kemalinecik/git_nosync/master_idtrack/idtrack/docs/_notebooks/idtrack_cache
Write test: OK
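Because the first graph build can take disk space, it may be worth checking free space on the same filesystem while you are here. The following is a sketch under stated assumptions: probe_cache_dir is a hypothetical helper, it repeats the write test from the cell above but removes the probe file afterwards, and it reports free space without assuming any particular minimum (no required size is given in this tutorial).

```python
import os
import shutil
from pathlib import Path


def probe_cache_dir(path):
    """Return (writable, free_gib) for a cache directory, creating it if needed."""
    path = Path(path).expanduser().resolve()
    try:
        path.mkdir(parents=True, exist_ok=True)
        probe = path / "_idtrack_write_test.txt"
        probe.write_text("ok", encoding="utf-8")
        probe.unlink()  # remove the probe so the cache folder stays clean
        writable = True
    except OSError:  # includes PermissionError
        writable = False
    # If the directory could not be created, measure its nearest existing parent.
    existing = path
    while not existing.exists():
        existing = existing.parent
    free_gib = shutil.disk_usage(existing).free / 1024**3
    return writable, free_gib


writable, free_gib = probe_cache_dir(os.environ.get("IDTRACK_LOCAL_REPO", "./idtrack_cache"))
print(f"Writable: {writable}; free space: {free_gib:.1f} GiB")
```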
Step 6 — Network sanity checks (optional, but recommended)
First-time graph builds need network access to Ensembl services. This cell performs non-destructive checks:
Can we reach the Ensembl REST API?
(If IDTrack is installed) what MySQL host does IDTrack expect?
Note: Some institutions block outbound MySQL ports. IDTrack can still work via the HTTPS/FTP MySQL dumps when MySQL is unreachable (slower but functional).
import socket
try:
    import requests
    try:
        r = requests.get('https://rest.ensembl.org/info/ping', headers={'Content-Type': 'application/json'}, timeout=15)
        print('Ensembl REST:', r.status_code, r.text.strip()[:80])
    except Exception as e:
        print('Ensembl REST check failed ->', repr(e))
except Exception as e:
    print('requests not available ->', repr(e))
if IDTRACK_OK:
    from idtrack._db import DB
    host = DB.mysql_host
    ports = [3306, 5306, 3337]
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=5):
                print(f'Ensembl MySQL: OK ({host}:{port})')
        except OSError as e:
            print(f'Ensembl MySQL: not reachable ({host}:{port}) -> {e.__class__.__name__}')
else:
    print('Skipping MySQL check (IDTrack not imported).')
Ensembl REST: 200 {"ping":1}
Ensembl MySQL: OK (ensembldb.ensembl.org:3306)
Ensembl MySQL: OK (ensembldb.ensembl.org:5306)
Ensembl MySQL: OK (ensembldb.ensembl.org:3337)
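If you want the port check as a reusable function (for example to run it on a cluster node before submitting a long job), it can be factored out as below. This is a sketch, not IDTrack API: reachable_ports and describe_mysql_access are hypothetical helpers, and the fallback message only restates the behaviour described in the note above.

```python
import socket


def reachable_ports(host, ports, timeout=5.0):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        try:
            # A successful TCP handshake is enough; no MySQL login is attempted.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:  # refused, timed out, or DNS lookup failed
            pass
    return open_ports


def describe_mysql_access(open_ports):
    """Turn the probe result into a one-line summary."""
    if open_ports:
        return f"live MySQL reachable on port(s) {open_ports}"
    return "no MySQL port reachable; expect the slower HTTPS/FTP dump fallback"
```

With the host shown in the output above, a call would look like describe_mysql_access(reachable_ports('ensembldb.ensembl.org', [3306, 5306, 3337])).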
What’s next?
Part 0 (concepts): 00_idtrack_overview.ipynb
Part 2 (external database configuration): 02_prepare_new_external_yaml.ipynb
Part 3 (graph builds): 03_initialization_graph.ipynb
Expected milestone after Part 3: you have a cached graph snapshot on disk, and conversions become fast and reproducible.
Troubleshooting (common issues)
ImportError: … h5py … (often on macOS): install hdf5 via conda, then reinstall h5py.
REST works but MySQL ports fail: this is common; IDTrack will fall back to HTTPS/FTP dumps. If you want live-MySQL speed, you may need outbound ports 3306/5306 (and 3337 only for the human GRCh37 archive).
Permission errors in the cache directory: choose a different IDTRACK_LOCAL_REPO you can write to.
Slow first run: normal. The first build populates caches; later runs reuse them.
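When reporting one of these issues, it helps to include which of the relevant modules import cleanly. A small diagnostic along these lines can collect that; report_module is a hypothetical helper, and the module list is only an example based on the packages named in this section.

```python
import importlib


def report_module(name):
    """Return a one-line import status for the module called `name`."""
    try:
        mod = importlib.import_module(name)
    except Exception as exc:  # ImportError, or a failure while loading a native library
        return f"{name}: import failed -> {exc!r}"
    return f"{name}: {getattr(mod, '__version__', 'unknown')}"


for name in ("h5py", "idtrack"):
    print(report_module(name))
```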