Ernst & Young (EY) Foundry: ConvoCoach

A scalable artificial intelligence (AI)-powered platform providing simulated role-play scenarios, utilizing a unique AI approach to offer realistic conversational skills practice in a judgement-free environment.

eyconvocoach.ey.com

Duration: 4 months

Responsibilities: Lead collaborator (team of 6 & EY’s team), survey analysis, UI design, technical writing, project management.

Project type: UX Research

Project Deliverable

Provide EY with findings and recommendations for ConvoCoach related to verbal conversational skills — what users say and the meaning of their words.

Research

We investigated the market landscape by looking at competitors and similar platforms, including Grammarly, Microsoft Presenter Coach, Orai, Ummo, Metronome Beats, and Duolingo, to discover features and information-sharing techniques. We created an interview guide tailored to each user group being interviewed and asked open-ended questions: what interviewees found distracting in work-related conversations, how they would rank the importance of the features, and, more specifically, how each feature would affect their interactions.

Findings: Shortlisting Features

In our onboarding meeting with EY, we discussed the features that were valuable to the development of ConvoCoach: tone, filler words, swearing, originality, pauses & pace, and readability. These features give ConvoCoach metrics for analyzing the user’s speech and professionalism in verbal conversations.

Interviewees and EY considered originality, tone, and pauses & pace to be the most important features.

We eliminated swearing and filler words based on interviewees’ insights. Swear words are not expected in professional conversations, although they can be a good indicator of the user’s emotional state. As with filler words, feedback on this metric would likely require little beyond flagging.

Readability was also eliminated, based on feedback that ConvoCoach already provides the key points of the conversation before the user starts a module, so users have context for the simulated conversation.

Recommendation

We created prototypes to visualize how these features can be assessed to give users detailed feedback that improves their verbal conversations with colleagues and clients.

These prototypes integrate EY’s existing design with our interface design for the features.

Features include: pauses & pace, tone (pitch & volume) and originality.

Pauses & Pace

The purple bars indicate pauses along the timeline. The length of each bar indicates how long the user paused at that point in the conversation.

The yellow bars indicate areas in the conversation where the speaker was either too fast or too slow.

The (i) icon reveals how much faster or slower the user was speaking compared to the desired range by showing the average pace over time in words per minute (wpm).

Tech Implementation
The speaker’s pace in words per minute is recorded and measured to determine whether the user’s speech is well-paced. Pauses will also need to be measured to determine whether they are too long or too short for the context of the conversation.
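
As a rough illustration of the measurement described above, the sketch below computes pace per time window and finds pauses from word-level timestamps. It assumes a speech-to-text step already supplies a start and end time for each word. The `Word` type, the function names, the 0.5-second pause threshold, and the 120 to 160 wpm range are all hypothetical placeholders, not values from the ConvoCoach project.

```python
import math
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def find_pauses(words, min_gap=0.5):
    """Return (pause_start, duration) for each silent gap of at least min_gap seconds."""
    pauses = []
    for prev, cur in zip(words, words[1:]):
        gap = cur.start - prev.end
        if gap >= min_gap:
            pauses.append((prev.end, gap))
    return pauses


def pace_by_window(words, window=10.0):
    """Return (window_start, wpm) pairs; the last window may be shorter than the rest."""
    if not words:
        return []
    total = words[-1].end
    n_windows = max(1, math.ceil(total / window))
    counts = [0] * n_windows
    for w in words:
        index = min(int(w.start // window), n_windows - 1)
        counts[index] += 1
    result = []
    for i, count in enumerate(counts):
        start = i * window
        duration = min(window, total - start)
        wpm = count * 60.0 / duration if duration > 0 else 0.0
        result.append((start, wpm))
    return result


def classify_pace(wpm, low=120.0, high=160.0):
    """Label a pace value against an assumed desired range (hypothetical bounds)."""
    if wpm < low:
        return "too slow"
    if wpm > high:
        return "too fast"
    return "ok"
```

The pause list maps naturally onto the purple bars and the per-window labels onto the yellow bars. In practice the thresholds would need to vary by scenario, since the acceptable length of a pause depends on the context of the conversation.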

Tone

The line layered over the waveform shows the user their average pitch/volume throughout the response.

A corresponding progress bar tracks the video audio, allowing the user to receive real-time feedback.

Subtle tips for places of concern in the user’s recorded timeline are revealed when the user hovers over the yellow flag pins.

We recommend assessing the metrics against the user’s baseline tone so that variations are analyzed accurately.

Tech Implementation
The user’s voice will be captured and measured to create averages across the overall soundbite. It will be useful to define boundaries based on what is considered most appropriate for each situation or scenario.
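
One way to picture the baseline comparison is the minimal sketch below. It assumes pitch (or volume) has already been extracted as one number per audio frame, derives a baseline from the user's own speech, and flags frames that stray far from it. The function names and the cutoff of two standard deviations are illustrative assumptions, not part of the ConvoCoach design.

```python
from statistics import mean, pstdev


def compute_baseline(samples):
    """Return (mean, standard deviation) of the user's own pitch or volume samples."""
    return mean(samples), pstdev(samples)


def flag_deviations(samples, base_mean, base_std, k=2.0):
    """Return indices of frames more than k standard deviations from the baseline.

    k=2.0 is an assumed cutoff; a real boundary would be set per scenario.
    """
    if base_std == 0:
        return []
    return [i for i, s in enumerate(samples) if abs(s - base_mean) > k * base_std]
```

Because the boundary is expressed relative to each user's own average, a naturally high or low voice is not flagged by default. The flagged indices are candidates for the yellow flag pins on the timeline.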

Originality

We recommend an achievement badge to motivate originality. This recommendation can be applied to other metrics to motivate users and incentivize practicing their verbal skills.

Tech Implementation
The system will need to compare the user’s speech transcripts with the expert answer and the user’s previous responses to determine originality and long-term retention of the modules.
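
The comparison described above could be sketched as follows. This example uses Jaccard overlap of word sets as a simple stand-in for text similarity; it is an assumption chosen for brevity, and a real system might use a more robust measure. The function names and the scoring rule (one minus the highest similarity to any reference) are hypothetical.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Share of distinct words that two texts have in common, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def originality_score(response, expert_answer, previous_responses=()):
    """Return 1.0 minus the highest similarity to the expert answer or any earlier response."""
    references = [expert_answer, *previous_responses]
    closest = max(jaccard(response, ref) for ref in references)
    return 1.0 - closest
```

A score near 0 suggests the user is repeating the expert answer or one of their own earlier responses, while a score near 1 suggests new wording. A threshold on this score could decide when the achievement badge is awarded.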

Other considerations

Features are context-dependent.
It is important to factor in the ethical implications of giving specific feedback on speaking patterns.

Users may naturally have variations in the way they speak due to cultural background, spoken languages, accents, dialect, neurodivergence, and personal abilities.

It’s important to further develop the program to be inclusive of such variability.
