nikolasbakalis.com

Nikolas Bakalis

Personal work, research, and technical writing

A focused archive of backend systems work, data infrastructure benchmarks, ML/NLP research, and technical writing, intended as context for collaborations, applications, and deeper engineering conversations.


Data systems
ML/NLP
Backend engineering
Research notes

Selected research

Reports and artifacts

Research 2026-05-25

Translation Model Benchmark for Multilingual Video Transcripts

A multilingual benchmark comparing Google Translate, DeepL, and Llama 4 Maverick on noisy video transcript data across 15 languages.

Translation
LLMs
Evaluation
Transcripts

Published


Research 2026-05-22

RDS vs StarRocks 20M Serving and Aggregation Benchmark

A 20-million-row benchmark comparing RDS/Postgres serving tables with StarRocks OLAP tables and async materialized views.

RDS
StarRocks
OLAP
Benchmark

Published


Research 2026-05-21

Materialized View API Serving Benchmark

An API-level comparison of denormalized RDS tables and StarRocks async materialized views at 100K, 1M, and 10M rows.

RDS
StarRocks
API
Materialized views

Published


Research 2026-03-20

IAB 3.0 Content Classifier Training Report

A training report for an in-house hierarchical IAB 3.0 content classifier built to replace external classification dependencies.

IAB 3.0
Classification
NLP
Training

Published


Research 2026-03-16

API Framework Benchmark for Data-Intensive Services

A benchmark of API frameworks and datastore access patterns for data-intensive services spanning PostgreSQL, StarRocks, and OpenSearch.

API
Backend
PostgreSQL
OpenSearch

Published


Research 2026-03

IAB 3.0 Classification and Language Detection Pipeline Proposal

A project proposal for replacing external classification APIs with in-house IAB 3.0 classification and language detection pipelines at media-corpus scale.

IAB 3.0
Language detection
Classification
Pipeline

Published


Tools

Selected technologies

The stack changes by project, but the recurring work centers on reliable data paths, measurable model behavior, and production API surfaces.

AWS
PostgreSQL
StarRocks
OpenSearch
Iceberg
Python
TypeScript
Go
Bun
Docker
FastAPI
Fastify
Celery
NLP
LLM evaluation

Research

Dated work that can be cited


Report index

Backend performance benchmarks
Database serving architecture
ML and NLP classification systems
Translation model evaluation

New York, NY

About this site

I work on data-intensive backend systems, evaluation workflows, and ML/NLP pipelines where correctness, performance, and operational tradeoffs matter. This site collects dated reports and implementation notes so the work can be reviewed directly instead of being reduced to a resume bullet.