Case Study 002

Mobi: Microservices


Microservices Architecture

Mobi is a complex image analysis platform. To handle heavy computational loads (like SAM segmentation) without blocking the UI, we use an asynchronous microservices architecture.

The system is containerized with Docker, orchestrated via Docker Compose, and relies on a message broker pattern.

Containerization Strategy

Each service runs in its own isolated container. We use multi-stage builds to keep image sizes small (e.g., stripping build tools from the final Python image).

# Dockerfile.backend
FROM python:3.10-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt

# Final Stage
FROM python:3.10-slim
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH

Async Task Queue

When a user requests a segmentation, the FastAPI backend doesn't process it immediately. Instead, it pushes a job to Redis.

Specialized Celery workers (CPU for logic, GPU for inference) pick up these jobs, which decouples the HTTP request/response cycle from heavy processing.

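# task_queue/worker.py
import logging

logger = logging.getLogger(__name__)

# celery_app and model are created elsewhere in the project (not shown).
@celery_app.task(bind=True)
def process_image(self, image_id):
    logger.info("Processing %s...", image_id)
    # Heavy lifting here
    result = model.predict(image_id)
    return result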
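The decoupling described above can be illustrated without Celery or Redis. The following is a minimal standard-library sketch, not Mobi's implementation: `queue.Queue` stands in for the Redis broker, a thread stands in for a Celery worker, and `submit_job`, `worker_loop`, `fake_segment`, and `run_demo` are hypothetical names.

```python
import queue
import threading
import uuid

# Stand-ins (assumptions): queue.Queue plays the role of the Redis broker,
# and a plain dict plays the role of a result backend.
job_queue = queue.Queue()
results = {}


def fake_segment(image_id: str) -> str:
    """Hypothetical placeholder for the expensive model call."""
    return f"mask-for-{image_id}"


def submit_job(image_id: str) -> str:
    """What the HTTP handler does: enqueue the job and return an id at once."""
    job_id = uuid.uuid4().hex
    job_queue.put((job_id, image_id))
    return job_id


def worker_loop() -> None:
    """What a worker does: pull jobs until it sees the None sentinel."""
    while True:
        item = job_queue.get()
        if item is None:
            break
        job_id, image_id = item
        results[job_id] = fake_segment(image_id)


def run_demo() -> str:
    worker = threading.Thread(target=worker_loop)
    worker.start()
    job_id = submit_job("img-1")  # returns without waiting for processing
    job_queue.put(None)           # ask the worker to stop once the queue drains
    worker.join()
    return results[job_id]
```

The handler's only cost is the enqueue, so request latency stays flat regardless of how long the worker takes.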

SQL Proficiency

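We use PostgreSQL for structured data and index the columns that complex spatial queries depend on. The example below is a simpler activity report: it ranks users by the number of measurements recorded in the last seven days.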

-- Analyze user activity
SELECT u.email, COUNT(m.id) AS measurement_count
FROM users u
JOIN images i ON u.id = i.user_id
JOIN measurements m ON i.id = m.image_id
WHERE m.created_at > NOW() - INTERVAL '7 days'
-- Group by every selected non-aggregate column so the query does not rely on u.id being the primary key
GROUP BY u.id, u.email
ORDER BY measurement_count DESC;
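To see what the activity query returns, it can be run against toy data. This is a hedged sketch using Python's built-in `sqlite3` rather than PostgreSQL: the schema is an assumed minimum (only the columns the query touches), the index name `idx_measurements_created_at` is hypothetical, and the 7-day window uses SQLite's `datetime('now', '-7 days')` in place of `NOW() - INTERVAL '7 days'`.

```python
import sqlite3

# Assumed minimal schema; the real tables certainly have more columns.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE images (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
CREATE TABLE measurements (
    id INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES images(id),
    created_at TEXT NOT NULL
);
-- Hypothetical index supporting the time-window filter.
CREATE INDEX idx_measurements_created_at ON measurements (created_at);
"""

ACTIVITY_QUERY = """
SELECT u.email, COUNT(m.id) AS measurement_count
FROM users u
JOIN images i ON u.id = i.user_id
JOIN measurements m ON i.id = m.image_id
WHERE m.created_at > datetime('now', '-7 days')
GROUP BY u.id, u.email
ORDER BY measurement_count DESC;
"""


def recent_activity():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(1, "a@example.com"), (2, "b@example.com")],
    )
    conn.executemany("INSERT INTO images VALUES (?, ?)", [(10, 1), (20, 2)])
    # User 1 has two recent measurements; user 2 has one recent and one stale.
    conn.execute("INSERT INTO measurements VALUES (1, 10, datetime('now', '-1 day'))")
    conn.execute("INSERT INTO measurements VALUES (2, 10, datetime('now', '-2 days'))")
    conn.execute("INSERT INTO measurements VALUES (3, 20, datetime('now', '-3 days'))")
    conn.execute("INSERT INTO measurements VALUES (4, 20, datetime('now', '-30 days'))")
    rows = conn.execute(ACTIVITY_QUERY).fetchall()
    conn.close()
    return rows
```

The 30-day-old measurement is filtered out before aggregation, so user 2 is counted once and sorts below user 1.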

SAM/ML Pipeline

Mobi uses Meta's Segment Anything Model (SAM) for AI-powered image segmentation. Because GPU inference is expensive, the architecture separates embedding generation from prediction and caches embeddings in Redis, so repeated interactions on the same image do not recompute the embedding from scratch.

When a user uploads an image, a Celery GPU worker loads SAM using sam_model_registry, a registry pattern that maps model variant names (e.g., "vit_h") to constructor functions. The model loads once per worker process and is cached in module-level globals, so subsequent calls return the cached predictor immediately.

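# gpu_sam_tasks.py: model loading (lines 43–74)
from segment_anything import SamPredictor, sam_model_registry

# Module-level cache: one model and one predictor per worker process.
_gpu_sam_model = None
_gpu_sam_predictor = None
_gpu_is_model_loaded = False

def get_sam_predictor_on_gpu() -> SamPredictor:
    global _gpu_sam_model, _gpu_sam_predictor, _gpu_is_model_loaded

    # Fast path: the model is already loaded in this process.
    if _gpu_is_model_loaded and _gpu_sam_predictor:
        return _gpu_sam_predictor

    # settings and sam_checkpoint_abs come from the surrounding module (not shown).
    sam_model_type = settings.SAM_MODEL_TYPE  # e.g., "vit_h"
    _gpu_sam_model = sam_model_registry[sam_model_type](checkpoint=sam_checkpoint_abs)
    _gpu_sam_model.to(device="cuda")
    _gpu_sam_predictor = SamPredictor(_gpu_sam_model)
    _gpu_is_model_loaded = True
    return _gpu_sam_predictor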

After the model is ready, a second Celery task generates image embeddings. predictor.set_image() converts the image into a high-dimensional feature representation. These features (predictor.features) are serialized and written to Redis with a 1-hour TTL. The main service polls Redis until embeddings appear before dispatching the prediction task.

# gpu_sam_tasks.py: embedding generation (lines 81–144)
import cv2

@celery_app.task(name="gpu_sam_tasks.generate_embedding_gpu_task", ignore_result=True)
def generate_embedding_gpu_task(image_id: str, image_path: str):
    predictor = get_sam_predictor_on_gpu()

    # cv2.imread returns None instead of raising when the file cannot be read.
    image_bgr = cv2.imread(image_path)
    if image_bgr is None:
        raise FileNotFoundError(f"Could not read image at {image_path}")
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    predictor.set_image(image_rgb)  # runs the image encoder once

    # _serialize_tensor and sam_service are project helpers (not shown).
    embedding_data = {
        "features_serialized": _serialize_tensor(predictor.features),
        "original_size_hw": predictor.original_size,
        "input_size_hw": predictor.input_size,
    }
    sam_service.cache_image_embedding(
        image_id=image_id, embedding_data=embedding_data, expiration_seconds=3600
    )
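The cache-then-poll handshake between the GPU worker and the main service can be sketched with an in-memory stand-in for Redis. Everything here is an assumption for illustration: `TTLCache`, `embedding_key`, `wait_for_embedding`, the key format, and the default timeout and interval are hypothetical and not taken from Mobi's code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis set-with-expiry / get pair (assumption)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, bytes]] = {}

    def set(self, key: str, value: bytes, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # lazily evict the expired entry
            return None
        return value


def embedding_key(image_id: str) -> str:
    """Hypothetical key format for cached embeddings."""
    return f"sam:embedding:{image_id}"


def wait_for_embedding(
    cache: TTLCache,
    image_id: str,
    timeout_seconds: float = 30.0,
    poll_interval: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[bytes]:
    """Poll until the embedding appears or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        value = cache.get(embedding_key(image_id))
        if value is not None:
            return value
        if clock() >= deadline:
            return None  # caller decides whether to retry or report an error
        sleep(poll_interval)
```

Injecting `sleep` and `clock` keeps the polling loop testable without real delays. A bounded timeout matters because a failed embedding task would otherwise leave the caller waiting indefinitely.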

Real-time Notifications

Rather than polling the server for task completion, Mobi uses WebSocket connections to push notifications to the browser the moment a background job finishes. The scope is intentionally narrow: the WebSocket channel carries notification events only, not live data streams.

ConnectionManager maintains a registry of active WebSocket connections keyed by user ID. A Dict[str, List[WebSocket]] structure allows a single user to have multiple simultaneous connections (e.g., multiple browser tabs) — each tab receives the notification. An asyncio.Lock() prevents concurrent modification of the registry.

# connection_manager.py (lines 5–34)
import asyncio
from typing import Dict, List

from fastapi import WebSocket

class ConnectionManager:
    def __init__(self):
        # user_id -> all open sockets for that user (one per browser tab)
        self._connections: Dict[str, List[WebSocket]] = {}
        self._lock = asyncio.Lock()

    async def connect(self, user_id: str, websocket: WebSocket):
        await websocket.accept()
        async with self._lock:
            self._connections.setdefault(user_id, []).append(websocket)

    async def send_personal(self, user_id: str, data):
        async with self._lock:
            # Copy the list so iteration below is safe after the lock is released.
            conns = list(self._connections.get(user_id, []))
        for ws in conns:
            await ws.send_json(data)
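The excerpt shows only `connect` and `send_personal`. A registry like this also needs a way to drop sockets that have closed, otherwise a single dead tab can raise during fan-out. The following is a hedged sketch of one way to handle that, not the project's code: `PruningConnectionManager` and `JsonSocket` are hypothetical names, and treating any send failure as a dead socket is an assumption.

```python
import asyncio
from typing import Any, Dict, List, Protocol


class JsonSocket(Protocol):
    """The only method this sketch needs from a WebSocket-like object."""

    async def send_json(self, data: Any) -> None: ...


class PruningConnectionManager:
    def __init__(self) -> None:
        self._connections: Dict[str, List[JsonSocket]] = {}
        self._lock = asyncio.Lock()

    async def connect(self, user_id: str, ws: JsonSocket) -> None:
        async with self._lock:
            self._connections.setdefault(user_id, []).append(ws)

    async def disconnect(self, user_id: str, ws: JsonSocket) -> None:
        async with self._lock:
            conns = self._connections.get(user_id, [])
            if ws in conns:
                conns.remove(ws)
            if not conns:
                # Drop the key so the registry does not grow with idle users.
                self._connections.pop(user_id, None)

    async def send_personal(self, user_id: str, data: Any) -> int:
        """Send to every socket for the user; return how many sends succeeded."""
        async with self._lock:
            conns = list(self._connections.get(user_id, []))  # snapshot
        delivered = 0
        for ws in conns:
            try:
                await ws.send_json(data)
                delivered += 1
            except Exception:
                # Assumption: any send failure means the socket is unusable.
                await self.disconnect(user_id, ws)
        return delivered
```

Sending happens outside the lock on a snapshot of the list, so a slow client cannot block other users from connecting, and `disconnect` can safely re-acquire the lock while the loop is running.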

When a Celery task completes, NotificationService.create_and_dispatch() persists a Notification row to PostgreSQL first, then calls manager.send_personal() to fan out to all of the user's connected clients. The persistence-first pattern ensures notifications survive page refreshes, because the frontend can re-fetch missed notifications on reconnect.

# notification_service.py (lines 12–20)
async def create_and_dispatch(self, notif: NotificationCreate) -> Notification:
    db_obj = crud_notification.create_notification(self.db, notif)
    await manager.send_personal(notif.user_id, {
        "id": db_obj.id,
        "message": db_obj.message,
        "severity": db_obj.severity,
        "type": db_obj.type,
        "created_at": db_obj.created_at.isoformat(),
    })
    return db_obj
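The value of persisting before pushing shows up when the push fails. Below is a small synchronous sketch under stated assumptions: `NotificationStore` is an in-memory stand-in for the PostgreSQL table, `fetch_since` is a hypothetical catch-up query keyed on the last notification id the client saw, and the `push` callable stands in for the WebSocket fan-out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class StoredNotification:
    id: int
    user_id: str
    message: str


@dataclass
class NotificationStore:
    """In-memory stand-in for the notifications table (assumption)."""

    rows: List[StoredNotification] = field(default_factory=list)

    def create(self, user_id: str, message: str) -> StoredNotification:
        row = StoredNotification(id=len(self.rows) + 1, user_id=user_id, message=message)
        self.rows.append(row)
        return row

    def fetch_since(self, user_id: str, last_seen_id: int) -> List[StoredNotification]:
        """Return the user's notifications newer than the last one they saw."""
        return [r for r in self.rows if r.user_id == user_id and r.id > last_seen_id]


def create_and_dispatch(
    store: NotificationStore,
    push: Callable[[str, Dict[str, Any]], None],
    user_id: str,
    message: str,
) -> StoredNotification:
    """Persist first, then attempt the push; a failed push loses nothing."""
    row = store.create(user_id, message)
    try:
        push(user_id, {"id": row.id, "message": row.message})
    except ConnectionError:
        pass  # client offline; it can catch up via fetch_since on reconnect
    return row
```

On reconnect, a client would send the highest notification id it has and receive everything after it, which is why the row must exist before the push is attempted.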

Workspace UI

The Mobi frontend is a multi-panel workspace where every pane can be dragged, resized, and rearranged. Rather than building custom drag handlers, the app delegates this entirely to react-mosaic-component — a tiling window manager for React.

LayoutManager is a thin controlled wrapper. It receives layout (the current tile tree), renderTile (how to render each view), and onChange (a callback when the user repositions a panel). The underlying Mosaic component handles all hit-testing, pointer events, and animation. Panels can be resized down to 1% of available space via minimumPaneSizePercentage.

// LayoutManager.tsx (lines 13–72)
import { Mosaic } from 'react-mosaic-component'

const LayoutManager = <T extends string = ViewId>(props: LayoutManagerProps<T>) => {
  const { renderTile, layout, onChange } = props

  return (
    <Mosaic<T>
      renderTile={renderTile}
      value={layout}
      onChange={onChange}
      className="mosaic-custom-theme"
      resize={{ minimumPaneSizePercentage: 1 }}
    />
  )
}

The bottom toolbar uses Material-UI icons — pre-built accessible SVG components imported by name. Each panel type maps to a semantic icon: a ruler for the measurement tool, a clipboard for tasks, a chart for results. No hand-drawn SVGs or custom paths are needed.

// BottomBar.tsx (lines 3–5)
import RulerIcon from '@mui/icons-material/Straighten'
import TaskIcon from '@mui/icons-material/Assignment'
import ResultsIcon from '@mui/icons-material/InsertChart'

// Used in JSX as:
<RulerIcon />  <TaskIcon fontSize="small" />  <ResultsIcon />