Hi, I’m Marius 👋

I’m an AI Platform Architect based in Romania, currently working at IBM on enterprise-scale AI solutions. I specialize in building intelligent systems that bridge the gap between cutting-edge AI and production-ready infrastructure.

What I Do

My work sits at the intersection of AI, cloud architecture, and platform engineering. I design and implement solutions using:

  • AI/ML Platforms: watsonx.ai, OpenShift AI, NVIDIA GPUs, LangChain/LangGraph for agentic AI systems, AI governance frameworks
  • Cloud & Infrastructure: OpenShift/ROKS, Kubernetes, AWS, GCP, hybrid cloud architectures
  • DevOps & Automation: CI/CD pipelines, GitOps, Terraform, Infrastructure as Code

I’ve had the privilege of working across diverse industries, including advertising, automotive, energy utilities, and telecommunications, building everything from Patent Analyzer systems to prosumer advisor platforms powered by agentic AI.

NVIDIA & AI Infrastructure

I have a deep interest in NVIDIA AI infrastructure and the technologies powering modern AI datacenters. My expertise includes:

  • NVIDIA AI Enterprise Stack: GPU Operators, NVIDIA Base Command Manager, container toolkits
  • High-Performance Networking: InfiniBand architecture, NVIDIA Quantum switches, RDMA, GPUDirect technologies
  • NVIDIA Spectrum-X: Ethernet-based AI networking for scale-out GPU clusters
  • DGX Systems & SuperPOD: Architecture patterns for large-scale AI training infrastructure

I’m particularly fascinated by the convergence of high-performance computing and AI workloads, and how technologies like InfiniBand and NVLink enable the massive GPU clusters that power today’s foundation models.

High-Performance Computing (HPC)

Beyond AI-specific infrastructure, I have strong interest and knowledge in traditional HPC technologies that increasingly underpin AI supercomputing:

  • Workload Management: Slurm, PBS, job scheduling and resource allocation for large-scale clusters
  • Parallel Filesystems: Lustre, IBM Storage Scale, BeeGFS — high-throughput storage for distributed workloads
  • Cluster Management: Provisioning, monitoring, and maintaining bare-metal HPC clusters
  • MPI & Distributed Computing: Message passing patterns, collective operations, multi-node training orchestration

The convergence of HPC and AI is creating fascinating challenges — from scheduling mixed workloads to optimizing data pipelines for training at scale.

My Journey

Before diving deep into AI, I built my foundation in cloud architecture and DevOps across organizations like Vodafone, DoiT International, Verne Global, and PaddyPower Betfair. This background gives me a practical perspective on deploying AI systems that actually work in production environments.

Beyond Work

I maintain a home lab where I experiment with the latest in AI infrastructure — currently running OpenShift on an HP Z840 workstation with an NVIDIA RTX 5070 Ti — for hands-on exploration of NVIDIA tooling, OpenShift Virtualization, and emerging AI platforms.

Let’s Connect

Feel free to reach out if you want to discuss AI architecture, cloud platforms, or anything tech-related.