A Guardrail for the Hardest Conversations: (Bilingual) Youth Crisis Detection | Montreal

June 17, 2026 · Montreal

Bilingual Youth Crisis Detection Guardrail

This talk presents a guardrail for youth mental health conversations that uses a two-stage system to detect crises that build gradually. Learn how a novel LLM prompting technique improves crisis detection accuracy.
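
The listing does not say how the two stages are divided. As a rough sketch only, assuming a common cascade in which a cheap per-message scorer runs on every turn and a more expensive reviewer is called only when a rolling average crosses a threshold, the control flow might look like the following. The function `run_guardrail`, the parameters `window_size` and `escalate_at`, the toy scores, and both stand-in callables are hypothetical and are not taken from the talk.

```python
from collections import deque

def run_guardrail(messages, score_fn, review_fn, window_size=4, escalate_at=0.3):
    """Return one boolean per message: True when the conversation is flagged.

    score_fn  -- hypothetical stage-one scorer, maps a message to a risk in [0, 1]
    review_fn -- hypothetical stage-two reviewer, maps a list of messages to a bool
    """
    window = deque(maxlen=window_size)  # most recent messages only
    decisions = []
    for message in messages:
        window.append(message)
        # Stage 1: average the cheap scores so that gradual drift accumulates.
        rolling = sum(score_fn(m) for m in window) / len(window)
        # Stage 2: the costly reviewer runs only once the rolling score is high enough.
        decisions.append(rolling >= escalate_at and review_fn(list(window)))
    return decisions

# Toy data (assumed): precomputed scores stand in for a real classifier.
toy_scores = {
    "School was fine today.": 0.0,
    "I have not been sleeping much.": 0.25,
    "Je me sens seul ces temps-ci.": 0.5,
    "Nothing feels worth it anymore.": 0.75,
}

def toy_review(window):
    # Stand-in for an LLM judgment over the whole window.
    return sum(toy_scores[m] >= 0.5 for m in window) >= 2

decisions = run_guardrail(list(toy_scores), toy_scores.get, toy_review)
print(decisions)  # [False, False, False, True]
```

In this toy run the conversation is flagged only at the fourth turn, once the window average rises above the threshold. A real system would replace both stand-ins with trained models and tune the thresholds on labeled conversations.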

Tech stack
  • mmBERT
    mmBERT is an open-source, massively multilingual encoder-only language model trained on 3 trillion tokens across 1,833 languages.
    Developed by researchers at Johns Hopkins University, mmBERT is a modern successor to the aging XLM-RoBERTa, bringing recent transformer optimizations to encoder-only models. Built on the ModernBERT architecture, it is reported to run inference 2 to 4 times faster than earlier multilingual encoders and natively supports an 8,192-token context window. Its core innovation is an annealed language learning strategy: a three-phase training schedule that prevents overfitting on high-resource languages while building robust representations for low-resource ones. This makes mmBERT an efficient choice for multilingual classification, retrieval, and semantic search.
  • Cohere c4ai-command-a-03-2025
    Cohere's 111-billion-parameter open-weights model built for high-throughput enterprise tasks, advanced tool use, and multilingual operations across 23 languages.
    Developed by Cohere and Cohere For AI, Command A is a 111-billion-parameter open-weights model designed to deliver maximum performance with minimal hardware overhead (deployable on just two GPUs). It features a massive 256,000-token context length and delivers a 150% throughput boost over its predecessor, Command R+ 08-2024. Optimized for business-critical workflows, the model excels at agentic tasks, multi-step tool use, and retrieval-augmented generation (RAG) with built-in document citation, making it a highly efficient, secure choice for demanding enterprise environments.
  • PyTorch
    PyTorch is an open-source machine learning framework that provides a Python-first tensor library with strong GPU acceleration and a dynamic computation graph for building deep neural networks.
    PyTorch, originally developed by Meta AI, is an open-source deep learning framework widely used in both research and production. Its core is a tensor library, similar to NumPy, with GPU acceleration that can deliver speedups of 50x or more on some workloads. Its key differentiator is a 'Pythonic' design with a dynamic computation graph (eager execution), which allows faster prototyping and simpler debugging than static-graph frameworks. Using its Autograd system for automatic differentiation, practitioners build and train models for computer vision and NLP; companies such as Tesla (Autopilot) and Microsoft use PyTorch in production AI applications.
  • Hugging Face Transformers
    The Hugging Face Transformers library is an open-source Python toolkit that provides a unified API for more than 1 million pre-trained models (such as BERT, GPT-2, and T5) across NLP, vision, and audio tasks.
    Hugging Face Transformers is an open-source Python library for working with state-of-the-art machine learning models. It offers a unified API across frameworks (PyTorch, TensorFlow) for loading and using more than 1 million pre-trained model checkpoints, including widely used models such as BERT, GPT-2, and T5. Developers use the high-level `Pipeline` class for quick inference (e.g., text generation, sentiment analysis) and the `Trainer` class for fine-tuning and distributed training. The library connects to the Hugging Face Hub, so models for text, vision, and audio can be deployed with minimal code.
  • gpt-oss-120b
    OpenAI's open-weight Mixture-of-Experts model built for local, high-reasoning production workloads on a single 80GB GPU.
    Released by OpenAI, gpt-oss-120b is a 117-billion-parameter open-weight language model designed to deliver advanced reasoning and agentic capabilities on local infrastructure. Because it uses a Mixture-of-Experts (MoE) architecture, the model activates only 5.1 billion parameters per token, which lets developers run demanding workloads on a single 80GB GPU (such as an NVIDIA H100 or AMD MI300X). It features a 131,072-token context window and integrates natively with Hugging Face Transformers and Ollama, making it an accessible option for complex instruction following and tool-use tasks without cloud overhead.
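
The gpt-oss-120b entry above notes that a Mixture-of-Experts model activates only a fraction of its parameters per token. A minimal sketch of that idea, assuming generic top-k routing, is shown below: a router scores every expert, only the k best are evaluated, and their outputs are mixed with softmax weights. The expert count, the value of `k`, the toy scalar experts, and the names `top_k_route` and `moe_forward` are illustrative assumptions, not gpt-oss-120b's actual configuration or code.

```python
import math

def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Rank experts by router logit and keep the k best.
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = ranked[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_logits, k):
    """Mix the outputs of the k routed experts; the other experts are never called."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(gate_logits, k))

# Toy setup (assumed): 8 scalar "experts", of which only 2 run per token.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]
gate_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

routes = top_k_route(gate_logits, k=2)
output = moe_forward(3.0, experts, gate_logits, k=2)
print([i for i, _ in routes])  # [1, 4]
```

Because only the routed experts are evaluated, compute per token scales with k rather than with the total number of experts, which is the general reason an MoE model's active parameter count can be much smaller than its total.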