If you already know what a character should sound like, text prompting is only half the job. The real production need is simpler: take a voice sample you trust and turn it into that character's working voice inside the project. Midsummerr now supports that on hybrid-mode projects with a new Clone a Voice flow inside the character modal.
That is the actual update. This is not a vague "voice cloning is coming" announcement. It shipped as a character-level workflow on the voices page: open a character, switch to the new tab, record or upload a sample, confirm you have the rights to use it, and clone that voice directly into the project.
What shipped
The new flow lives inside the same modal where you already edit and regenerate character voices. Alongside Edit & Regenerate and Previous Voices, hybrid projects now show a third tab: Clone a Voice.
From there, the shipped workflow does four concrete things:
- It lets you record or upload a voice sample for that character.
- It gives clear sample-quality guidance before you submit anything: one speaker, quiet room, no music, and roughly 30 to 60 seconds of clean speech.
- It requires an explicit rights-and-consent confirmation before the clone runs.
- It plays the cloned sample back in the modal once the new voice is ready.
That matters because it turns custom voice work into part of the same production surface instead of an off-platform workaround. You do not need a separate handoff just to test whether a voice reference actually belongs on the page.
Why this is more useful than another text-only voice prompt
Text prompts are good at direction. They are weaker at preserving a voice identity you already have in mind.
Sometimes the problem is not "make this character younger" or "make the line warmer." The problem is that you already have a specific performance reference, actor sample, or narrator texture you want the character to inherit. Before this update, the only route to that voice was indirect: approximate it through text prompts. Now the workflow is more direct: bring in the sample, run the clone, and judge the result with your ears.
For teams producing dialogue-heavy scenes, that changes the speed of iteration. You can move from abstract voice notes to a concrete per-character sample without leaving the project workflow.
The scope is intentionally narrow
This launch is useful because it is specific, not because it tries to do everything at once.
The shipped feature is:
- Per character, not a project-wide voice replacement
- Inside the existing voices modal, not a separate dashboard flow
- Available on hybrid-mode projects, because that is the voice path currently wired for cloned voices
That narrow scope is the right one for production work. Character voice decisions usually happen one role at a time. You do not need a global "clone everything" button when the creative job is deciding whether this voice belongs on this character.
The rights gate matters as much as the clone itself
Voice cloning features get messy when the product pretends consent is an afterthought. This flow does the opposite. Before the user can submit the sample, the modal requires an explicit confirmation that they have the rights to use the voice and any consent required for AI-generated audiobook output.
That is the right production framing. A custom voice is not just a sound-design decision. It is also a rights decision. If a team is going to use cloned voices in a commercial audiobook workflow, that responsibility needs to be stated at the point of use, not buried elsewhere.
What this changes for real productions
The simplest use case is also the most common one: you have a role that should sound like a real person you already have access to, and you want to test that voice inside the actual cast instead of describing it from scratch.
For authors and producers, that makes a few workflows easier:
- keeping a known voice reference attached to one important role
- trying a more specific lead-character voice without rebuilding the rest of the cast
- comparing a cloned custom option against regenerated synthetic options in the same modal history
It also fits the broader Midsummerr production model. We already let teams cast voices, refine pronunciations, shape dialogue, and iterate chapter by chapter. This update makes the "what should this character sound like?" step more direct when the answer already exists in audio form.
If you want the broader production paths around that workflow, the pricing page shows the current tiers and the features page shows the rest of the production surface. If you want to judge the finished output first, start with Jane Eyre, Frankenstein, or Alice in Wonderland.
FAQ
Where does Clone a Voice live in Midsummerr?
It lives inside the character voice modal on the voices page. On hybrid-mode projects, the modal now includes a Clone a Voice tab next to the existing edit/regenerate and history views.
Can I record a sample, or do I need to upload a file?
You can do either. The shipped flow supports recording a sample directly in the UI or uploading an audio file, then using that sample as the source for the character voice.
Is this a project-wide voice import?
No. The shipped feature is character-level. You clone a voice for one character at a time inside that character's modal.
Does the flow include a consent check?
Yes. Before the clone can run, the user must explicitly confirm that they have the rights to use the voice and any required consent for AI-generated audiobook output.




