Can large language models replace manual coding of open-end survey responses without sacrificing quality? We outline a four-phase approach: (1) testing AI-generated codeframes against human baselines; (2) evaluating assignment accuracy via interrater reliability and agreement subsets; (3) weighing the costs and benefits of additional training data; and (4) sharing lessons from operationalizing the approach at KS&R. Attendees will leave with methods for quantitatively assessing AI's utility in open-end coding and practical tools to begin experimenting within their own organizations.
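To illustrate the kind of interrater check described in phase (2), the sketch below compares hypothetical human and AI code assignments for the same responses using Cohen's kappa. The code labels and data are invented for illustration; a real evaluation would use your own coded open-ends, and multi-code responses would call for a different agreement statistic.

```python
# Minimal sketch of a human-vs-AI agreement check (phase 2).
# Assumes single-label code assignments; data below is hypothetical.
from sklearn.metrics import cohen_kappa_score

# Hypothetical codeframe assignments for six open-end responses
human_codes = ["price", "price", "service", "quality", "service", "other"]
ai_codes    = ["price", "quality", "service", "quality", "service", "other"]

# Cohen's kappa adjusts raw percent agreement for chance agreement
kappa = cohen_kappa_score(human_codes, ai_codes)
print(f"Human-AI agreement (Cohen's kappa): {kappa:.2f}")
```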