Android Speech Recognition Tutorial

On-Device Face Recognition In Android

2025-12: Add new FaceNet models with known sources, enable MLKit for face detection and precise NN-search.
2024-09: Add face-spoof detection which uses FASNet from ...

GitHub

Speech To Speech: Build voice agents with open-source models

This starts an OpenAI Realtime-compatible server at ws://localhost:8765/v1/realtime using Parakeet TDT for local STT, an OpenAI-compatible LLM, and Qwen3-TTS for ...

IEEE

Multitask Transformer for Cross-Corpus Speech Emotion Recognition

Abstract: Deep learning has significantly advanced the field of Speech Emotion Recognition (SER), yet its efficacy in cross-corpus scenarios remains a challenge. To overcome this limitation, recent ...

Android Authority

The Gemini app's mic just got a major upgrade for multilingual users

The Gemini app’s mic now supports inputs in over 70 languages. You can mix different languages as well, and you don’t need to change any language settings. The feature is available on Android and iOS, ...

Android

Meta got caught quietly shipping facial recognition to Ray-Ban glasses – and pulled it just as quietly

Meta secretly embedded facial recognition code – internally called NameTag – into the Meta AI app used to pair its Ray-Ban smart glasses, shipping it to over 50 million phones without telling anyone.

Twitter

Gemini 3.5 Live Translate brings real-time speech translation, 70+ language support, and Android listening mode

Google has announced Gemini 3.5 Live Translate, its latest AI-powered speech translation model designed to enable natural, real-time multilingual communication. Built on Google’s translation ...

FoneArena

Google rolls out Gemini 3.5 Live Translate with real-time speech translation, 70+ language support, and Android listening mode

Google has introduced Gemini 3.5 Live Translate, a new audio model designed for real-time speech-to-speech translation. The system builds on two decades of machine learning work in translation and is ...

IEEE

Keyword Guided Target Speech Recognition

Abstract: This letter presents a new target speech recognition problem, where the target speech is defined by a keyword. For instance, when a person speaks “Hey Google” or “Help Me”, we hope the model ...
