Gemini + Drive video integration

 
 

CONTEXT

Video is strategic to the long-term success of Google Workspace. However, video files weren’t supported in the Gemini side panel due to competing priorities.

Additionally, part of Drive’s 2025 strategy is to improve the consumption experience for third-party file types, including video.

 

OPPORTUNITY

I secured a high-visibility commitment for Gemini + Meet Recording by creating concepts for its narrative in Sundar’s demo at the Google I/O keynote (2.1M views). The demo accelerated Gemini support for videos in Workspace.

 

CONSTRAINT

90-day I/O launch commitment.

 

PROBLEM

Videos take too much time to consume. 
Imagine the scenario: You missed a training session. It's an hour long; even at 2x speed, it's still 30 minutes. You're busy, so you skip it.

 

EXISTING RESEARCH

There are two primary video use cases: creation and consumption. Small and Medium-sized Businesses (SMBs) often create videos, while Mid-Market and Enterprise users mainly watch them. The types of videos these users typically watch include:

  • Company announcements / presentations / team meetings

  • Webinars / trainings / lectures / tutorials

Additionally, there are four core archetypes with the largest willingness to pay and GenAI impact: Sales, Support, Coordinate, and IT / Security. Video can add significant value for Mid-Market and Enterprise users, especially in Sales, Support, and Coordination, as shown by competitive solutions.

Our target audience is SMBs, Mid-Market & Enterprise companies with a focus on Sales, Support & Coordination roles.

 

RESEARCH

During a UX/PM workshop, we identified a list of primary "jobs-to-be-done" for small and medium-sized businesses (SMBs), mid-market, and enterprise companies, focusing on sales, support, and coordination roles.

This process resulted in four prominent themes, derived from an extensive collection of video types. The data was systematically gathered from past HaTS surveys, dedicated diary studies, and a literature review. This approach ensures that our definition of "jobs-to-be-done" is robust and accurately reflects the needs of these critical business functions.

UXR took those jobs and asked users to prioritize them based on frequency and tedium.

The MaxDiff results ranked the top jobs, which we used for starter tiles, prompt engineering, and quality evals.
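
As a rough illustration of how a MaxDiff ranking can be produced, here is a minimal sketch of count-based scoring: each job's score is the number of times it was picked as "best" minus the number of times it was picked as "worst", divided by the number of times it was shown. The job names and responses below are hypothetical, not the study's data, and the study's actual analysis method (and how it combined frequency and tedium) is not described here; this sketch handles a single criterion.

```python
from collections import Counter


def maxdiff_scores(responses):
    """Count-based MaxDiff score per item: (best picks - worst picks) / times shown."""
    shown, best, worst = Counter(), Counter(), Counter()
    for shown_items, best_item, worst_item in responses:
        shown.update(shown_items)   # every item in the set was shown once
        best[best_item] += 1        # the item picked as "most"
        worst[worst_item] += 1      # the item picked as "least"
    return {item: (best[item] - worst[item]) / shown[item] for item in shown}


# Hypothetical jobs and responses, for illustration only.
jobs = ("summarize", "action items", "find a moment", "explain")
responses = [
    (jobs, "summarize", "explain"),
    (jobs, "action items", "explain"),
    (jobs, "summarize", "find a moment"),
]

scores = maxdiff_scores(responses)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Scores range from -1 (always picked as worst) to 1 (always picked as best), so the sorted list gives a simple priority order for candidate jobs.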

 

MY APPROACH

To meet our 90-day launch commitment, I used existing Workspace Gemini AI design patterns to deliver a spec in two weeks.

DESIGN EXPLORATION

I explored the following critical user journeys (CUJs):

  • Opening Meet recordings throughout Workspace

  • Starter tiles to help users understand what questions to ask

  • Question and answer

  • Citations to reference the source

 

PROTOTYPE AND EVALUATE

As our engineering team started development, I continued to assess and refine. I created a prototype for usability evaluation. The main objectives were to understand the overall experience, starter tiles, and citations.

Findings

  1. Participants appreciated the default summary, but felt it lacked specific important meeting details such as the list of attendees, time, date, etc.

  2. The ‘outline the key takeaways’ and ‘list action items’ starter tiles worked as expected; however, the ‘Explain something said’ tile was vague and lacked clear expectations.

  3. The automatic generation of text in the prompt box upon clicking a starter tile was unexpected and felt like an unnecessary step. Most participants preferred that the response be generated immediately.


ITERATE

As a result:

  • We improved the quality and extended the length of the summary.

  • We ultimately removed the ‘Explain something said’ starter tile.

  • I worked with the Workspace Platform AI team to ensure that clicking a starter tile would generate a response instead of placing the prompt in the input box.

 

LAUNCH & VALIDATE

We launched Gemini in Video in May 2025. The initial GA landing report indicated:

Engagement

  • 0.01% of Product Engaged user days are Vidkick engaged
    This is low

  • 0.09% engagement
    Benchmark: Gemini in PDF is at 0.9% (10x higher); however, PDFs have better feature discoverability

  • In 89% of video consumption events, the video isn’t watched; users aren’t watching the video to consume the information

Retention

  • On a 7-day basis, users use Vidkick on 1.18 days on average, showing that retention is low

 

WHAT’S NEXT

  • Feature discoverability

  • Incomplete transcript

  • Quality

  • Speaker attribution

  • Additional internationalization support

  • Citation cards