Midjourney, a rapidly growing startup regarded by many enthusiasts as a leader in AI image generation since its 2022 launch, has unveiled the highly anticipated version 7 of its image generation model.
The standout innovation in this release is a new method allowing users to create images using voice input.
Previously, users were limited to text prompts, optionally supplemented with reference images. Now, anyone whose device has a microphone can speak to Midjourney’s alpha website (alpha.midjourney.com), and the model will listen to the spoken description and generate corresponding images.
It remains to be seen if Midjourney has developed its own speech-to-text technology for this feature or if it has integrated solutions from established providers like ElevenLabs or OpenAI. Inquiries directed at Midjourney founder David Holz for clarification have not yet received a response.
Utilizing Draft Mode and Conversational Voice Input
Complementing voice input is a new “Draft Mode,” designed to generate images faster than the previous version, v6.1. Generations often finish in under a minute and sometimes in as little as 30 seconds.
While the initial output quality may not match that of v6.1, users have the option to enhance or modify drafts by selecting “enhance” or “vary” buttons next to the generated images.
The intention is to make the creative process smoother by letting users express ideas verbally rather than meticulously drafting text prompts. This approach encourages real-time interaction: users can respond to each new generation by voice to refine their artistic vision, for example by telling the model to adjust lighting, detail, or vibrancy.
Getting Started with Midjourney v7
The first step to accessing the new features, including Draft Mode, involves creating a personalized style through Midjourney’s updated personalization feature.
This feature, first introduced for Midjourney v6 in June 2024, lets users set preferences by rating pairs of images. Users must now establish a personalized style specifically for v7 before using the model for the first time.
Following this setup, users can access the familiar Midjourney Alpha dashboard, click “Create,” and then enable personalization mode through the new “P” button in the prompt section.
Personalization styles created for v6 can still be selected, but moodboards (image collections uploaded by users) are currently unavailable. Midjourney has indicated that it plans to restore this feature shortly.
Once in the creation phase, users can activate Draft Mode by selecting the corresponding button, which turns orange to indicate that it is active. A microphone icon for voice input appears alongside it.
After switching on voice input, users can speak naturally, and Midjourney will generate images based on their descriptions. Technical difficulties, such as connection errors, may occur, but they are generally resolved by restarting voice mode or refreshing the webpage.
As users articulate commands, Midjourney will display keyword suggestions and generate an accompanying prompt while rendering a new set of images based on the spoken input.
Introducing New Features Amid Existing Limitations
At launch, Midjourney v7 runs in two modes: Turbo, which offers higher performance at double the cost of a standard v6 job, and Draft Mode, which costs half as much per job as a standard v6 job. A standard-speed mode is under development for future release.
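To make the relative pricing concrete, here is a minimal Python sketch. It assumes a standard v6 job is normalized to 1.0 cost unit and reads Draft Mode's rate as half of that standard v6 rate. The names `RELATIVE_COST` and `batch_cost` are hypothetical and purely illustrative; they are not part of any Midjourney API, and actual billing (for example, in GPU time) may work differently.

```python
# Relative job costs, normalized so that a standard v6 job costs 1.0 unit.
# Assumption: Draft Mode is read as half of the standard v6 rate.
STANDARD_V6_COST = 1.0

RELATIVE_COST = {
    "turbo": 2.0 * STANDARD_V6_COST,  # double a standard v6 job
    "draft": 0.5 * STANDARD_V6_COST,  # half a standard v6 job
}


def batch_cost(mode: str, jobs: int) -> float:
    """Return the cost of `jobs` jobs in `mode`, in standard-v6-job units."""
    if mode not in RELATIVE_COST:
        raise ValueError(f"unknown mode: {mode!r}")
    if jobs < 0:
        raise ValueError("jobs must be non-negative")
    return RELATIVE_COST[mode] * jobs


if __name__ == "__main__":
    for mode in ("turbo", "draft"):
        print(mode, batch_cost(mode, 10))
```

Under these assumptions, a Turbo job costs four times as much as a Draft job, which is why iterating in Draft Mode and enhancing only the promising results can be the cheaper workflow.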
Currently, certain functionalities like upscaling, inpainting, and retexturing rely on the previous v6 model, but Midjourney aims to transition these capabilities to v7 through forthcoming updates.
The company has committed to a robust update schedule, planning improvements every one to two weeks, including the introduction of a novel character and object reference system aimed at enhancing user experience.
Midjourney emphasizes that v7 represents a distinct model with unique advantages and challenges, inviting users to explore various prompt techniques and provide feedback to facilitate further refinement.
Mixed Initial Reactions Compared to Prior Versions
The response to Midjourney v7 has been notably mixed, in contrast to the overwhelming enthusiasm that typically greeted earlier releases.
Although Midjourney labels v7 an “alpha” release on its official blog and on social media, many users anticipated a significant leap in image quality, prompt adherence, and the rendering of human anatomy, particularly hands.
Comments on social media reflect some disappointment, with users expressing a desire for greater advancements. For instance, @freiboitar wrote that while v7 appears more realistic, it does not represent a substantial improvement over earlier versions.
Others echoed similar sentiments regarding the incremental nature of v7’s updates. In contrast, some users have expressed satisfaction with their initial outputs, noting improved artistic elements and image quality.
Because the release is still new, opinions may shift as more people experiment with the model and its features. Midjourney v7 is currently available to all existing users.
Source
venturebeat.com