A TF-IDF feature pipeline paired with a gradient-boosted tree model satisfies every stated constraint. TF-IDF vectors can be built inside or near the SQL warehouse, and gradient-boosted ensembles score quickly on commodity CPUs while exposing feature-importance values or SHAP explanations to meet the transparency demand.
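As a rough illustration only, a minimal scikit-learn sketch of such a pipeline might look like the following; the ticket texts, category labels, and parameter values are all hypothetical and not taken from the scenario.

```python
# Minimal sketch of a TF-IDF + gradient-boosted tree routing pipeline
# (assumes scikit-learn is installed; data below is purely illustrative).
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical ticket texts and routing categories.
tickets = [
    "Cannot log in to the billing portal",
    "Invoice shows the wrong amount",
    "App crashes when uploading a file",
    "Request to add a new user seat",
    "Password reset email never arrives",
    "Feature request: export reports to CSV",
]
categories = ["access", "billing", "bug", "account", "access", "feature"]

# TF-IDF turns raw text into sparse numeric features; the gradient-boosted
# ensemble learns the mapping to categories and scores quickly on CPUs.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=5000, ngram_range=(1, 2))),
    ("gbt", GradientBoostingClassifier(n_estimators=100)),
])
pipeline.fit(tickets, categories)

# Route a new ticket automatically.
print(pipeline.predict(["I was charged twice this month"]))

# Global feature importances support the transparency requirement;
# per-prediction SHAP values (via the separate `shap` package) can add detail.
vectorizer = pipeline.named_steps["tfidf"]
model = pipeline.named_steps["gbt"]
top_features = sorted(
    zip(vectorizer.get_feature_names_out(), model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)[:5]
print(top_features)
```

In a real deployment the vectorizer vocabulary and model would be trained on the warehouse's labeled history rather than a handful of strings, but the structure of the pipeline stays the same.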
A fine-tuned transformer, by contrast, requires GPU hardware to achieve practical sub-second inference and is difficult to justify under regulatory explainability requirements. A rules engine built from hundreds of regular expressions is brittle, hard to maintain at this scale, and offers little statistical insight into feature impact. Unsupervised k-means clustering does not learn the six predefined categories and would still require a manual mapping layer, violating the automatic routing requirement.
Therefore the gradient-boosted tree classifier with TF-IDF inputs is the most appropriate match to the gathered business needs.