Mastering Human Portraits with Stable Diffusion AI

Trust me, creating lifelike, realistic human images with Stable Diffusion is hard. AI-generated portraits have improved significantly, but photorealistic results still require well-crafted prompts, a suitable fine-tuned model, and the right tools. In this article, we explore techniques and strategies for generating human portraits with Stable Diffusion.

Understanding Stable Diffusion for Portraits

Stable Diffusion is a powerful AI model for image generation, but its default outputs often lack the fine details necessary for realistic human portraits. Achieving high fidelity requires:

Prompt Engineering: Crafting detailed and structured prompts.
CFG Scale & Sampling Methods: Adjusting parameters for optimal results.
Fine-tuned Models: Using specialized models trained on high-quality human images.
ControlNet & Inpainting: Refining features with additional tools.

This article focuses on prompt engineering and fine-tuned models, because these two techniques can significantly affect the quality of the final result. Choosing the right CFG scale or sampling method may help in some scenarios, but it does not always work. ControlNet and inpainting require more advanced knowledge and skills, so they are not covered here.

Let's Dive in

Parameters

Steps: 20
Sampler: Euler a
Schedule type: Karras
CFG scale: 7
Seed: -13456
Size: 512x512
Model hash: c6bbc15e32
Model: sd-v1-5-inpainting
Conditional mask weight: 1.0
Version: v1.9.4
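If you prefer to script your generations, the parameter block above can be turned into a request body. The sketch below only parses the block and builds a dictionary; it does not send anything. The field names (prompt, negative_prompt, steps, sampler_name, cfg_scale, seed, width, height) are the ones I understand the AUTOMATIC1111 txt2img API at /sdapi/v1/txt2img to accept when the UI is started with --api, so treat them as an assumption and check them against your version. The helper names are hypothetical, and the schedule type and model are left out because the model is selected in the UI.

```python
def parse_parameters(block: str) -> dict:
    """Parse 'Key: value' lines, as listed in this article, into a dictionary."""
    params = {}
    for line in block.strip().splitlines():
        key, _, value = line.partition(":")
        if value:
            params[key.strip()] = value.strip()
    return params


def to_txt2img_payload(params: dict, prompt: str, negative_prompt: str = "") -> dict:
    """Map parsed parameters onto the assumed txt2img request fields."""
    width, height = (int(n) for n in params["Size"].split("x"))
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": int(params["Steps"]),
        "sampler_name": params["Sampler"],
        "cfg_scale": float(params["CFG scale"]),
        "seed": int(params["Seed"]),
        "width": width,
        "height": height,
    }


# The generation settings copied from the parameter list above.
PARAMETERS = """
Steps: 20
Sampler: Euler a
Schedule type: Karras
CFG scale: 7
Seed: -13456
Size: 512x512
"""

payload = to_txt2img_payload(
    parse_parameters(PARAMETERS),
    "A highly detailed portrait of a woman",
)
print(payload)
```

Keeping the settings in one dictionary makes it easy to change a single value, such as the sampler or the CFG scale, and compare the results.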

Let's Prompt

Here are some basic prompts that can help you get started:

A highly detailed portrait of a woman.
A highly detailed portrait of a man.
A highly detailed portrait of a child.
A highly detailed portrait of a person.

Type in prompt:

A highly detailed portrait of a woman

Not quite what we expected. Let's add more details to the prompt:

A highly detailed portrait of a woman standing next to a window of a bridal dress shop.

It is getting better. Let's add more details:

A highly detailed portrait of a young woman standing next to a window of a bridal dress shop, facing the camera

So the more details we add to the prompt, the better the result.

Now let's add a negative prompt to prevent the model from generating unwanted elements:

deformed face, disfigured, asymmetric, duplicate, tattoos, marks

Wow, now that is a lot better! But it is still not there yet.

Let's keep working on that:

A highly detailed portrait of a young woman standing next to a window of a bridal dress shop, facing the camera, ultra-realistic, cinematic lighting, DSLR sharp focus, soft shadows, perfect skin texture, 8K resolution, photorealism, HDR

Fingers crossed!

How unfortunate that the model still generates disfigured faces.

Don't give up, let's keep working on that.

Change the parameters:

Steps: 20
Sampler: DPM2
Schedule type: Karras
CFG scale: 8
Seed: 13456
Size: 512x512
Model hash: c6bbc15e32
Model: sd-v1-5-inpainting
Conditional mask weight: 1.0
Version: v1.9.4

Change the prompt:

A highly detailed portrait of a young woman with highlight hair, standing next to a window of a bridal dress shop looking at the camera. The face is illuminated by soft, natural lighting, highlighting lifelike skin texture, expressive eyes, and a natural facial expression. The image has a cinematic look, with balanced contrast, depth of field, studio lighting, ultra-realistic, cinematic lighting, DSLR sharp focus, soft shadows, perfect skin texture, 8K resolution, photorealism, dslr, ultra quality, tack sharp, dof, film grain, Fujifilm XT3, crystal clear,

We threw a lot of fancy words into it.

Change the negative prompt:

deformed face, disfigured, asymmetric, duplicate, tattoos, marks, Extra heads, extra faces, multiple faces, extra eyes, ugly, bad, immature, cartoon, anime, 3d, painting, b&w

Tadah!

Finally that is something we can use.

I suspect words like "film grain, Fujifilm XT3" are making the difference.
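One way to check a suspicion like this is to keep the seed and all other parameters fixed and regenerate the image with one tag removed at a time. The sketch below only builds the prompt variants; the helper name leave_one_out and the shortened tag list are hypothetical and chosen for illustration.

```python
def leave_one_out(base: str, tags: list[str]) -> dict[str, str]:
    """Return a mapping of dropped tag -> prompt built without that tag."""
    variants = {}
    for dropped in tags:
        kept = [tag for tag in tags if tag != dropped]
        variants[dropped] = ", ".join([base] + kept)
    return variants


# A shortened version of the prompt above, for illustration only.
BASE = "A highly detailed portrait of a young woman with highlight hair"
STYLE_TAGS = ["ultra-realistic", "DSLR sharp focus", "film grain", "Fujifilm XT3"]

for dropped, prompt in leave_one_out(BASE, STYLE_TAGS).items():
    print(f"without {dropped!r}: {prompt}")
```

If the image degrades noticeably only when a particular tag is missing, that tag is probably doing the work. With a single seed this is weak evidence, so it is worth repeating with a few seeds.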

So with the same negative prompt, we are gonna change the prompt to:

A highly detailed portrait of a young woman with highlight hair, standing next to a window of a bridal dress shop. She is looking at the camera, so we can see her eyes. full torso, shoulders and chest visible. The face is illuminated by soft, natural lighting, highlighting lifelike skin texture, expressive eyes, and a natural facial expression. The image is symmetric and has a cinematic look, with balanced contrast, studio lighting, ultra-realistic, DSLR sharp focus, soft shadows, perfect skin texture, ultra quality, dof, film grain, Fujifilm XT3, crystal clear,

Not bad.

Now we want to make an image of a different size with aspect ratio of 16:9, so we can use this image for the marketing material of a bridal dress shop.

Oops, new issue!

The image has two figures.

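This is a common issue with Stable Diffusion. The v1.5 models were trained on 512x512 images, so on a wider canvas the model tends to repeat the subject. We can either 1) stay with a size of 512x512, or 2) adjust both the positive and the negative prompts, change the CFG scale, the seed, or the sampling method, and generate many images until one works.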

Now the new parameters are:

Steps: 20
Sampler: DPM++ 2M
Schedule type: Karras
CFG scale: 15
Seed: 457
Size: 910x512
Model hash: c6bbc15e32
Model: sd-v1-5-inpainting
Conditional mask weight: 1.0
Version: v1.9.4
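The 910x512 size above follows from the 16:9 ratio at a height of 512. As a small sketch, the width can be computed like this. The function name is hypothetical. The optional rounding to a multiple of 8 is an assumption about your setup: Stable Diffusion works in a latent space that is 8 times smaller per side, so some front ends prefer dimensions divisible by 8.

```python
def width_for_aspect(height: int, ratio_w: int, ratio_h: int, multiple: int = 1) -> int:
    """Width matching ratio_w:ratio_h at the given height, rounded to the nearest `multiple`."""
    exact = height * ratio_w / ratio_h
    return int(round(exact / multiple)) * multiple


print(width_for_aspect(512, 16, 9))     # 910, the width used above
print(width_for_aspect(512, 16, 9, 8))  # 912, the nearest multiple of 8
```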

New prompt:

A highly detailed portrait of a young woman with highlight hair, standing next to a window of a bridal dress shop. She is looking at the camera, so we can see her eyes. full torso, shoulders and chest visible. The face is illuminated by soft, natural lighting, highlighting lifelike skin texture, expressive eyes, and a natural facial expression. In the background, on the woman's right, there is a table. The image has a cinematic look, with balanced contrast, studio lighting, ultra-realistic, DSLR sharp focus, soft shadows, perfect skin texture, ultra quality, dof, film grain, Fujifilm XT3, crystal clear,

New negative prompt:

deformed face, disfigured, asymmetric, duplicate, tattoos, marks, Extra heads, extra faces, multiple faces, extra eyes, ugly, bad, immature, cartoon, anime, 3d, painting, b&w, multiple figures, multiple heads

The idea of adding the table in the description is to fill up the space so Stable Diffusion won't generate two heads.
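As the negative prompt grows from one attempt to the next, it is easy to repeat a term. The sketch below merges comma-separated term lists and drops case-insensitive repeats while keeping the original order. The helper name merge_terms is hypothetical, and the second list is shortened for illustration.

```python
def merge_terms(*prompts: str) -> str:
    """Join comma-separated prompts, dropping case-insensitive repeats in first-seen order."""
    seen = set()
    merged = []
    for prompt in prompts:
        for term in prompt.split(","):
            term = term.strip()
            if term and term.lower() not in seen:
                seen.add(term.lower())
                merged.append(term)
    return ", ".join(merged)


BASE_NEGATIVE = "deformed face, disfigured, asymmetric, duplicate, tattoos, marks"
# "duplicate" is repeated here on purpose to show the de-duplication.
EXTRA_NEGATIVE = "Extra heads, extra faces, multiple faces, duplicate, multiple figures, multiple heads"

print(merge_terms(BASE_NEGATIVE, EXTRA_NEGATIVE))
```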

Finally we have a nice one:

More Examples

Red haired woman

Prompt:

A photograph of a red haired woman, realistic face, symmetrical, highly detailed, ultra HD

Negative prompt:

Extra heads, extra faces, multiple faces, extra eyes, deformed face, disfigured, asymmetric, duplicate, tattoos, marks

Parameters Used:

Steps: 139
Sampler: DPM++ 3M SDE
Schedule type: Karras
CFG scale: 12
Seed: 655
Size: 512x768
Model hash: c6bbc15e32
Model: sd-v1-5-inpainting
Conditional mask weight: 1.0
Version: v1.9.4

A Young Woman

Prompt:

a young woman with gorgeous face, delicate features, symmetrical, highly detailed

Negative prompt:

Extra heads, extra faces, multiple faces, extra eyes, deformed face, disfigured, asymmetric, duplicate, tattoos, marks

Parameters Used:

Steps: 20
Sampler: UniPC
Schedule type: Karras
CFG scale: 16
Seed: 655
Size: 512x512
Model hash: c6bbc15e32
Model: sd-v1-5-inpainting
Conditional mask weight: 1.0
Version: v1.9.4

To get an image you really like, you may have to repeat this process many times. If you don't want to go through this tedious process, you can use a model that is specifically trained to generate realistic human portraits.
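The repetition can be made more systematic by changing only the seed and keeping everything else fixed. The sketch below builds one request body per seed from the settings of the last example. It does not send anything; the field names are the ones I understand the AUTOMATIC1111 txt2img API to accept, so treat them as an assumption, and seed_sweep is a hypothetical helper name.

```python
import copy


def seed_sweep(base_payload: dict, seeds: list[int]) -> list[dict]:
    """Return one copy of the payload per seed, leaving the base payload untouched."""
    payloads = []
    for seed in seeds:
        payload = copy.deepcopy(base_payload)
        payload["seed"] = seed
        payloads.append(payload)
    return payloads


# Settings taken from the "A Young Woman" example above.
BASE_PAYLOAD = {
    "prompt": "a young woman with gorgeous face, delicate features, symmetrical, highly detailed",
    "negative_prompt": (
        "Extra heads, extra faces, multiple faces, extra eyes, deformed face, "
        "disfigured, asymmetric, duplicate, tattoos, marks"
    ),
    "steps": 20,
    "sampler_name": "UniPC",
    "cfg_scale": 16,
    "seed": 655,
    "width": 512,
    "height": 512,
}

batch = seed_sweep(BASE_PAYLOAD, list(range(655, 660)))
for payload in batch:
    print(payload["seed"], payload["sampler_name"])
```

Because only the seed differs between the payloads, any image you like can be reproduced later from its seed.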

Choosing the Right Model

The base Stable Diffusion model may not always generate photorealistic people without proper prompting. Instead, consider using:

F222: Known for better face generation.
RealisticVision: A popular choice for photorealistic AI art.
ChilloutMix: Good for anime-style and realistic portraits, especially for Asian facial features.
Custom LoRAs: Fine-tuned weights designed for specific enhancements in face rendering.

In this article, we will test generating human portraits with model F222.

Model Installation

F222

This section assumes you are running Stable Diffusion through Stable Diffusion WebUI Docker with the AUTOMATIC1111 UI.

In the directory where you installed Stable Diffusion WebUI Docker, find the folder data/models/Stable-diffusion. It should contain the following files:

-rw-r--r-- 1 root root 4265437280 Feb  4 13:39 sd-v1-5-inpainting.ckpt
-rw-r--r-- 1 root root 4265380512 Feb  4 13:39 v1-5-pruned-emaonly.ckpt

So go to the folder and run the following command to download the f222 model.

cd data/models/Stable-diffusion
wget https://huggingface.co/acheong08/f222/resolve/main/f222.ckpt

After the download finishes, click the refresh button in the top left corner of the UI. The f222 model then appears in the dropdown menu.
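A multi-gigabyte download can be cut off without an obvious error, so it can be worth checking the file. The sketch below computes a short SHA-256 fingerprint of a checkpoint. It assumes that the "Model hash" shown in the parameter lists of this article is the first 10 hex characters of the file's SHA-256, which is my understanding of recent AUTOMATIC1111 versions; verify this against your own installation. The function name is hypothetical.

```python
import hashlib
from pathlib import Path


def short_sha256(path: Path, length: int = 10, chunk_size: int = 1 << 20) -> str:
    """Return the first `length` hex characters of the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so a 4 GB checkpoint does not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


# Path taken from the download step above, relative to the WebUI Docker directory.
checkpoint = Path("data/models/Stable-diffusion/f222.ckpt")
if checkpoint.exists():
    print(checkpoint.name, short_sha256(checkpoint))
else:
    print(f"{checkpoint} not found; run this from the WebUI Docker directory")
```

If the value printed here differs from the hash the UI reports for the same model, downloading the file again is a reasonable first step.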

ChilloutMix

cd data/models/Stable-diffusion

If you have curl installed, use the following command. The -L flag lets curl follow any redirect that the download URL returns.

curl -L https://civitai.com/api/download/models/11745 -o ChilloutMix.safetensors

Or with wget:

wget https://civitai.com/api/download/models/11745
mv 11745 ChilloutMix.safetensors

If you have neither curl nor wget installed, you can download the model from Civitai in a browser and save it to data/models/Stable-diffusion.

Testing the Model

After selecting the f222 model, we can generate the image with the same prompts as before.

This model does generate nice images, but it cannot eliminate the extra figure in the image completely. So we need to run it a few times to find some good ones.

Making it a Billboard Ad for a Bridal Dress Shop

Since we have the ideal image, we can create a billboard ad for a bridal dress shop.

Final Thoughts

Mastering human portraits with Stable Diffusion AI requires patience and practice. By selecting the right model, refining your prompts, and leveraging additional tools like ControlNet and inpainting, you can achieve stunning, lifelike AI-generated portraits. Experiment with different settings and techniques to improve your results over time.

Are you ready to create your own masterpiece? Start experimenting with Stable Diffusion today and push the boundaries of AI-generated human portraits! For details about how to run Stable Diffusion with user-friendly UI, check out Using Stable Diffusion WebUI Docker for Image Generation – A Newbie's Guide.

If you need to create a new prompt from scratch and choose an appropriate negative prompt, consider using a text-to-image generation prompt helper like the one we provide at Tynion.

References

How to generate realistic people in Stable Diffusion

Get in touch

For further information, I can be reached via:

X (formerly Twitter): Eric Tang
Email: [email protected]
