How to Build a Virtual Makeup Try-On (VTO) App: A Step-by-Step Guide


Virtual Try-On (VTO) is transforming how users interact with products online, especially in the beauty and fashion industries. Building a smooth, real-time VTO experience directly in the web browser might seem daunting, but with modern JavaScript libraries, it’s more accessible than you think.

In this guide, we’ll walk through the development of a complete, camera-based makeup VTO web application. We’ll cover everything from live face tracking and 3D overlays to building a polished, interactive UI for product selection and sharing. The best part? This entire project runs as static assets, with no complex frameworks or bundlers required.

What We’re Building

Here’s a look at the key features we’ll implement:

  • Live Makeup Overlays: A real-time camera view that applies foundation, blush, eyeshadow (including glitter effects), and lipstick.

  • Core Tech: We’ll use Three.js for 3D rendering and MindAR for robust, in-browser face tracking.

  • Comparison UI: A “before” and “after” preview panel with a draggable divider.

  • Dynamic Product Selector: A clean UI with tabs and carousels, populated dynamically from a JSON file.

  • Capture & Share: One-tap snapshot, download, and native sharing capabilities.

Prerequisites

  • Browser: Latest Chrome/Edge/Safari with webcam support

  • Runtime: No backend required — static hosting is enough

  • Libraries:

    • MindAR face tracking (via mindar-face-three)

    • Three.js (core, plus post-processing helpers)

    • jQuery (lightweight DOM/event helpers)

    • Bootstrap (responsive layout + modal)

    • Owl Carousel (horizontal product carousel)

    • Howler.js (optional UI sfx)

    • dom‑to‑image (optional future capture path)

Tip: Keep everything client-side to minimize friction and maximize privacy. Users click a link, grant camera access, and go.

Project Structure

A well-organized project is easy to maintain. Here’s the simple layout we’ll use:

/
|-- vto.html                # The main (and only) HTML page
|-- template.json           # Product data (colors, images, categories)
|
|-- web_files/
|   |-- custom.js           # Our main application logic (AR, UI, capture)
|   |-- style.css           # All custom styles
|   |-- (vendor scripts...) # Bootstrap, jQuery, Owl Carousel, etc.
|
|-- assets/
|   |-- js/                 # Core libraries (Three.js, MindAR, WASM)
|   |-- images/             # UI icons and product images
|
|-- facemap_1024/
|   |-- fullface.png        # Texture masks for facial regions
|   |-- eyeshadow.png
|   |-- lipstick.png
|   |-- (and so on...)

1) HTML skeleton & module wiring

We start with vto.html. This file holds the DOM structure and, most importantly, uses an import map. This modern browser feature lets us use ES module import statements (like import * as THREE from 'three') and map them directly to our local library files, completely avoiding the need for a bundler like Webpack or Vite.

Here’s the essential setup in vto.html:

<link rel="stylesheet" href="./web_files/bootstrap.min.css" />
<link rel="stylesheet" href="./web_files/style.css" />
<link rel="stylesheet" href="./web_files/owl.carousel.min.css" />

<div id="webar_container">
  <div class="makeup-vto"></div>
  
  <img decoding="async" class="arrow-divider" src="./web_files/arrow-diff.png" draggable="false" />
  <div class="divider"><canvas id="cloned_video_canvas"></canvas></div>
  
  <div class="lighting-detect">Lighting too dark</div>
  
  </div>

<script type="importmap">
  {
    "imports": {
      "three": "./assets/js/three.module.min.js",
      "three/addons/": "./assets/js/three/addons/",
      "effectComposer": "./assets/js/three/addons/postprocessing/EffectComposer.js",
      "renderPass": "./assets/js/three/addons/postprocessing/RenderPass.js",
      "shaderPass": "./assets/js/three/addons/postprocessing/ShaderPass.js",
      "horizontalBlurShader": "./assets/js/three/addons/shaders/HorizontalBlurShader.js",
      "verticalBlurShader": "./assets/js/three/addons/shaders/VerticalBlurShader.js",
      "mindar-face-three": "./assets/js/mindar-face-three.prod.js"
    }
  }
</script>

<script src="./web_files/jquery.js"></script>
<script type="module" src="./web_files/custom.js"></script>

2) Initializing Face Tracking and 3D Scene

Now we move to web_files/custom.js. The first step is to import our dependencies and initialize MindARThree. This one object conveniently sets up the Three.js renderer, scene, and camera for us, automatically linking them to the face tracking feed.

import * as THREE from 'three';
import { MindARThree } from 'mindar-face-three';
// Post-processing modules, resolved through the import map in vto.html
import { EffectComposer } from 'effectComposer';
import { RenderPass } from 'renderPass';
import { ShaderPass } from 'shaderPass';
import { HorizontalBlurShader } from 'horizontalBlurShader';
import { VerticalBlurShader } from 'verticalBlurShader';

// Grab the container element from our HTML
const container = document.querySelector('.makeup-vto');

// Initialize MindAR
const mindarThree = new MindARThree({
  container,
  // Filtering parameters to stabilize the mesh tracking
  filterMinCF: 0.0001,
  filterBeta: 10000,
});

// Destructure the core Three.js components
const { renderer, scene, camera } = mindarThree;

3) Layering Makeup with Multiple Face Meshes

A key insight for this VTO is that we don’t use just one face mesh. To control each makeup component (lipstick, blush, eyeshadow) independently, we add a separate faceMesh object for each one. This allows us to toggle visibility, change color, or adjust the opacity of the lipstick without affecting the eyeshadow.

// We create a mesh for each distinct makeup layer
const faceMesh = mindarThree.addFaceMesh();
const eyeshadowMesh = mindarThree.addFaceMesh();
const eyeshadowGlitterMesh = mindarThree.addFaceMesh();
const blushonMesh = mindarThree.addFaceMesh();
const lipstickMesh = mindarThree.addFaceMesh();
const lipstickGlossyMesh = mindarThree.addFaceMesh();
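Because each layer owns its own material, toggling a product on or off reduces to flipping one material's visibility. A tiny helper makes the intent explicit (this helper is hypothetical — the project wires the same flag changes inline):

```javascript
// Hypothetical convenience helper: show or hide one makeup layer.
// Works with any object exposing a `visible` flag (e.g. a THREE.Material).
function setLayerVisible(material, visible) {
  material.visible = visible;
  material.needsUpdate = true; // ask Three.js to re-evaluate the material
  return material.visible;
}

// e.g. "remove lipstick" button: setLayerVisible(lipstickMesh.material, false);
```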

4) Applying Textures and Materials

These meshes are invisible until we give them materials. The “magic” of VTO comes from applying specially crafted texture masks. These are grayscale images (from our facemap_1024/ folder) where white/gray areas define where the makeup should appear (e.g., just on the lips) and black areas remain transparent.

We load these masks and create standard MeshBasicMaterial instances for them, setting transparent: true to allow them to overlay the video feed.

const textureLoader = new THREE.TextureLoader();

// Load all our alpha masks
let fullFaceTexture       = textureLoader.load('./facemap_1024/fullface.png');
let eyeshadowTexture      = textureLoader.load('./facemap_1024/eyeshadow.png');
let eyeshadowGlitterTexture = textureLoader.load('./facemap_1024/glitter.png');
let blushTexture          = textureLoader.load('./facemap_1024/blushon.png');
let lipstickTexture       = textureLoader.load('./facemap_1024/lipstick.png');

// Create materials for each mask
const faceMaterial = new THREE.MeshBasicMaterial({ 
  map: fullFaceTexture, 
  opacity: 0.16, 
  transparent: true 
});
const eyeshadowMaterial = new THREE.MeshBasicMaterial({ 
  map: eyeshadowTexture, 
  opacity: 0.35, 
  transparent: true, 
  visible: false // Start as invisible
});
const blushMaterial = new THREE.MeshBasicMaterial({ 
  map: blushTexture, 
  opacity: 0.17, 
  transparent: true, 
  visible: false 
});
const lipstickMaterial = new THREE.MeshBasicMaterial({ 
  map: lipstickTexture, 
  opacity: 0.5,   // starting opacity is a matter of taste
  transparent: true, 
  visible: false 
});
// ... the glossy lipstick and glitter materials follow the same pattern ...

// Assign materials to their corresponding meshes
faceMesh.material = faceMaterial;
eyeshadowMesh.material = eyeshadowMaterial;
blushonMesh.material = blushMaterial;
lipstickMesh.material = lipstickMaterial;

Spotlight: The Animated Glitter Shader

For the eyeshadow glitter, a static texture isn’t enough. We want it to sparkle! To do this, we use a custom ShaderMaterial. This lets us write GLSL (shader) code directly.

The fragment shader uses a random function based on the pixel’s UV coordinates and a time uniform (which we’ll update in our render loop) to make random pixels flash brightly, creating a shimmering effect.

const eyeshadowGlitterMaterial = new THREE.ShaderMaterial({
  // 'time' is a uniform we can update from JavaScript to create animation
  uniforms: { 
    baseTexture: { value: eyeshadowGlitterTexture }, 
    time: { value: 0 } 
  },
  vertexShader: `
    varying vec2 vUv; 
    void main(){
      vUv = uv; 
      gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
    }`,
  fragmentShader: `
    uniform sampler2D baseTexture; 
    varying vec2 vUv; 
    uniform float time; 
    
    // Simple pseudo-random function
    float random(vec2 st){
      return fract(sin(dot(st.xy, vec2(12.9898, 78.233)) + time) * 43758.5453123);
    } 

    void main(){ 
      vec4 color = texture2D(baseTexture, vUv); 
      float rnd = random(vUv * 5000.0); 
      if(rnd > 0.91) { // 9% of pixels
        color.rgb += vec3(1.5); // Make them brighter (sparkle)
      }
      gl_FragColor = color; 
    }`,
  transparent: true, 
  visible: false,
});

eyeshadowGlitterMesh.material = eyeshadowGlitterMaterial;

5) Compositing Layers with Post-Processing

Simply rendering all meshes into one scene can look harsh. To achieve a softer, more blended makeup look, we use post-processing.

We create dedicated scenes and renderers for different layers. This gives us fine-grained control. For example, we can render the foundation layer, apply a heavy blur to it, and then composite it on top of the video, followed by the less-blurry lipstick and glitter layers.

We use EffectComposer to chain these render passes together. Here, we add a RenderPass (to render the scene) followed by ShaderPass instances for horizontal and vertical blurs.

// Dedicated scenes for different logical layers
const glitterScene = new THREE.Scene();
glitterScene.add(eyeshadowMesh, eyeshadowGlitterMesh);

const faceScene = new THREE.Scene();
faceScene.add(faceMesh);

// Add remaining meshes to the main scene
scene.add(lipstickMesh, lipstickGlossyMesh, blushonMesh);

// Create extra canvases and renderers for our layers
const faceCanvas = document.createElement('canvas');
faceCanvas.id = 'full_face_canvas';
container.appendChild(faceCanvas);
const faceRenderer = new THREE.WebGLRenderer({ canvas: faceCanvas, alpha: true });

// ... create glitterCanvas and glitterRenderer ...

// Post-processing for the face (foundation) layer
const faceComposer = new EffectComposer(faceRenderer);
faceComposer.addPass(new RenderPass(faceScene, camera));
faceComposer.addPass(new ShaderPass(HorizontalBlurShader));
faceComposer.addPass(new ShaderPass(VerticalBlurShader));

// Post-processing for the glitter layer
const glitterComposer = new EffectComposer(glitterRenderer);
glitterComposer.addPass(new RenderPass(glitterScene, camera));

// Post-processing for the main makeup layer
const composer = new EffectComposer(renderer);
composer.addPass(new RenderPass(scene, camera));
composer.addPass(new ShaderPass(HorizontalBlurShader));
composer.addPass(new ShaderPass(VerticalBlurShader));

6) The Render Loop and Lighting Detection

With everything set up, we start MindAR and begin our animation loop using renderer.setAnimationLoop.

Inside the loop, we do two key things:

  1. Render each of our composers (composer, faceComposer, glitterComposer).

  2. Perform simple lighting detection. We sample the video frame, calculate the average brightness, and if it’s too dark (e.g., avg < 90), we hide the overlays and show the “Lighting too dark” message. This prevents the makeup from looking unnatural in poor lighting.

const start = async () => {
  // Start MindAR (this will request camera access)
  await mindarThree.start();
  
  // Start the render loop
  renderer.setAnimationLoop(() => {
    // ... update canvas sizes ...

    // Render all our effect composers
    composer.render();
    faceComposer.render();
    glitterComposer.render();

    // Animate the glitter sparkle by advancing the shader's time uniform
    eyeshadowGlitterMaterial.uniforms.time.value = performance.now() / 1000;

    // --- Lighting Detection ---
    // (Simplified: `ctx` is the 2D context of the canvas the video frame is drawn into)
    const imageData = ctx.getImageData(0, 0, ctx.canvas.width, ctx.canvas.height);
    const avg = calculateAverageBrightness(imageData);
    const tooDark = avg < 90;

    // Toggle visibility based on lighting
    document.querySelector('.lighting-detect').style.display = tooDark ? 'block' : 'none';
    scene.visible = glitterScene.visible = faceScene.visible = !tooDark;
  });
};
start();
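The loop relies on a calculateAverageBrightness helper that the snippet leaves undefined. One possible implementation follows — the sampling stride and the plain RGB mean are our choices, not the project's (a luma-weighted formula would work just as well):

```javascript
// Average brightness of a frame. `imageData` is anything shaped like a
// canvas ImageData object: { data: Uint8ClampedArray of RGBA bytes }.
function calculateAverageBrightness(imageData) {
  const data = imageData.data;
  let sum = 0;
  let count = 0;
  // Sample every 16th pixel (stride of 64 bytes) to keep the per-frame cost low
  for (let i = 0; i < data.length; i += 64) {
    sum += (data[i] + data[i + 1] + data[i + 2]) / 3; // mean of R, G, B
    count++;
  }
  return count ? sum / count : 0;
}
```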

7) Dynamically Changing Makeup Colors

This is where the app becomes interactive. When a user clicks a lipstick swatch, we need to change the color of our lipstickTexture without replacing the mask itself.

We do this by:

  1. Caching: The first time we modify a texture, we draw it to a canvas and store its original pixel data in an object (originalImageData).

  2. Recoloring: To apply a new color, we retrieve the original pixels, put them on a new canvas, and iterate through the imageData.

  3. Pixel Logic: For each pixel, we check its brightness. If the pixel belongs to the mask — i.e., it isn’t near-white (brightness ≤ 200) and isn’t pure white — we replace its R, G, and B values with the new, gamma-corrected color.

  4. Gamma Correction: We apply gamma correction (Math.pow(value, 1 / 1.6)) to the input color. This makes the colors look more natural and less “digital” when applied to the mask.

  5. Updating: We set the texture’s image source to the new canvas’s dataURL and flag it for update.

const originalImageData = {};
const gammaFactorValue = 1.6;

function applyGammaCorrection(value, gammaFactor = gammaFactorValue){
  return Math.pow(value, 1 / gammaFactor);
}

function changeTextureColor(texture, colorHex, material){
  if (!texture.image) return; // Wait for texture to load

  // Cache the original, clean texture data if we haven't already
  if (!originalImageData[texture.uuid]) {
    const c = document.createElement('canvas');
    c.width = texture.image.width; c.height = texture.image.height;
    c.getContext('2d').drawImage(texture.image, 0, 0);
    originalImageData[texture.uuid] = c.getContext('2d').getImageData(0, 0, c.width, c.height);
  }

  // Create a new canvas to draw the modified texture
  const c = document.createElement('canvas');
  c.width = texture.image.width; c.height = texture.image.height;
  const ctx = c.getContext('2d');
  
  // Start with the pristine, original mask data
  ctx.putImageData(originalImageData[texture.uuid], 0, 0);

  const imageData = ctx.getImageData(0, 0, c.width, c.height);
  const data = imageData.data;
  const color = new THREE.Color(colorHex);
  
  // Apply gamma correction to the target color
  const r = applyGammaCorrection(color.r) * 255;
  const g = applyGammaCorrection(color.g) * 255;
  const b = applyGammaCorrection(color.b) * 255;

  // Loop through all pixels (4 bytes at a time: R,G,B,A)
  for (let i = 0; i < data.length; i += 4) {
    const R = data[i], G = data[i+1], B = data[i+2];
    const brightness = (R + G + B) / 3;
    
    // If the pixel is part of the mask (not white/transparent)
    if (brightness <= 200 && !(R === 255 && G === 255 && B === 255)) {
      // Apply the new color
      data[i] = r; 
      data[i+1] = g; 
      data[i+2] = b;
    }
  }
  
  // Put the modified pixel data back on the canvas
  ctx.putImageData(imageData, 0, 0);
  
  // Update the Three.js texture with the new image
  texture.image.src = c.toDataURL();
  if (material) { 
    material.visible = true; // Make sure the layer is visible
    material.needsUpdate = true; 
  }
  texture.needsUpdate = true;
}

8) Populating the UI from template.json

Our UI is data-driven. We define all products, shades, and categories in template.json.

{
  "complexion": [
    { "code": "#21C", "item": [
      { "image": "./assets/images/complexion/21C-1.png", "color": "#EDB3A0", "type": "complexion" }
    ]}
  ],
  "season": [
    { "season": "spring", "tone": "warm", "lips": [
      { "category": "Matte Lip Cream", "item": [
        { "color": "#C47066", "code": "30", "name": "Amberella", "image": "./assets/images/products/lipcream/30-amberella.png" }
      ]}
    ]}
  ]
}

In our JavaScript, we fetch this file, parse it, and dynamically generate the HTML for the tabs and carousels, attaching click listeners to the swatches that call changeTextureColor(...) with the appropriate color.
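That step can be sketched as follows. The function and selector names here (collectLipShades, .product-carousel, .swatch) are illustrative, not from the project:

```javascript
// Flatten every lip shade in the "season" tree of template.json into one list.
function collectLipShades(template) {
  const shades = [];
  for (const season of template.season || []) {
    for (const group of season.lips || []) {
      for (const item of group.item || []) {
        shades.push({ category: group.category, ...item });
      }
    }
  }
  return shades;
}

// Browser-only wiring: fetch the JSON and render one clickable swatch per shade.
if (typeof document !== 'undefined') {
  fetch('./template.json')
    .then((res) => res.json())
    .then((template) => {
      const $carousel = $('.product-carousel'); // assumed container element
      for (const shade of collectLipShades(template)) {
        $('<div class="swatch"></div>')
          .css('background-color', shade.color)
          .attr('title', shade.code + ' ' + shade.name)
          .on('click', () => changeTextureColor(lipstickTexture, shade.color, lipstickMesh.material))
          .appendTo($carousel);
      }
    });
}
```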

9) Building the “Before & After” Divider

The split-screen comparison is a powerful feature. It’s composed of three elements:

  1. .makeup-vto: The container for our AR canvases (the “after” view).

  2. .divider: A container holding a canvas that just shows the mirrored, unprocessed video (the “before” view). It’s positioned on top of .makeup-vto.

  3. img.arrow-divider: The draggable handle.

We use jQuery to listen for mousedown or touchstart events on the divider, mousemove/touchmove to update its width, and mouseup/touchend to stop dragging.

/* style.css */
.divider { 
  position: absolute; 
  top: 0; 
  left: 0; 
  width: 50%; /* Starts halfway */
  height: 100%; 
  z-index: 2; 
  cursor: ew-resize; 
}
img.arrow-divider { 
  width: 24px; 
  cursor: pointer; 
  left: 50%; 
  position: absolute; 
  top: 50%; 
  transform: translateY(-50%); 
  z-index: 3; 
}
.makeup-vto { 
  position: absolute; 
  left: 0; 
  top: 0; 
  width: 100%; 
  height: 100%; 
}

// custom.js (Simplified)
$('.divider, img.arrow-divider').on('mousedown touchstart', function (e) {
  e.preventDefault();
  // ... store initial mouse position and divider width ...
  $(document).on('mousemove touchmove', resizeElement);
  $(document).on('mouseup touchend', stopResize);
});

function resizeElement(e) {
  // ... calculate new width based on mouse movement and apply it ...
}
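One possible shape for the elided handlers — the 4–96% clamp bounds and the #webar_container measurement are our assumptions:

```javascript
// Convert a pointer position into a divider width (as a percentage),
// clamped so the handle never leaves the viewport. Bounds are arbitrary.
function clampDividerPercent(clientX, containerWidth) {
  const pct = (clientX / containerWidth) * 100;
  return Math.min(96, Math.max(4, pct));
}

function resizeElement(e) {
  // Touch events carry coordinates on `touches`; mouse events carry them directly
  const clientX = e.touches ? e.touches[0].clientX : e.clientX;
  const pct = clampDividerPercent(clientX, $('#webar_container').width());
  $('.divider').css('width', pct + '%');
  $('img.arrow-divider').css('left', pct + '%');
}

function stopResize() {
  $(document).off('mousemove touchmove', resizeElement);
  $(document).off('mouseup touchend', stopResize);
}
```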

10) Capturing and Sharing the Look

Finally, we need to capture the result. Instead of a simple screenshot, we manually compose the final image for maximum control. This lets us apply filters and effects only to the capture, not the live feed.

When takeScreenshot is called:

  1. We create a new output canvas.

  2. Layer 1: Draw the raw video (mirrored).

  3. Layer 2: Draw the faceRenderer canvas (foundation) with a blur(7px) saturate(1.7) filter. This creates a soft, airbrushed complexion.

  4. Layer 3: Draw the glitterRenderer canvas.

  5. Layer 4: Draw the main renderer canvas (lipstick, blush) with a slight contrast pop.

  6. We convert this composite canvas to a dataURL and display it in our Bootstrap modal.

function takeScreenshot(callback){
  // Ensure all renderers are up-to-date
  renderer.render(scene, camera);
  faceRenderer.render(faceScene, camera);
  glitterRenderer.render(glitterScene, camera);

  const out = document.createElement('canvas');
  // ... set width/height ...
  const ctx = out.getContext('2d');

  // 1) Video (mirrored)
  ctx.save(); 
  ctx.translate(out.width, 0); 
  ctx.scale(-1, 1);
  ctx.drawImage(canvasVideo, 0, 0, out.width, out.height); 
  ctx.restore();

  // 2) Face canvas (soft complexion)
  ctx.save(); 
  ctx.filter = 'blur(7px) saturate(1.7)'; 
  ctx.globalAlpha = 0.93;
  ctx.drawImage(faceRenderer.domElement, 0, 0, out.width, out.height); 
  ctx.restore();

  // 3) Glitter overlay
  ctx.drawImage(glitterRenderer.domElement, 0, 0, out.width, out.height);

  // 4) Makeup canvas (lipstick, blush)
  ctx.save(); 
  ctx.filter = 'blur(2px) brightness(75%) contrast(190%)';
  ctx.drawImage(renderer.domElement, 0, 0, out.width, out.height); 
  ctx.restore();

  const dataUrl = out.toDataURL('image/png');
  // ... set image src and show modal ...
}

For sharing, we use the Web Share API, which provides a native share sheet on supported devices.

async function shareImage(){
  const dataUrl = document.querySelector('.image-captured').src;
  const blob = await (await fetch(dataUrl)).blob();
  const file = new File([blob], 'makeup-vto.png', { type: 'image/png' });
  
  // Feature-detect file sharing; not every browser that has share() can share files
  if (navigator.canShare && navigator.canShare({ files: [file] })) {
    await navigator.share({ files: [file] });
  }
}
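The download half of the capture feature can be as small as a temporary anchor element. Everything here (the .image-captured selector reuse, the helper names, the filename scheme) is an assumption to illustrate the approach:

```javascript
// Build a date-stamped filename like "makeup-vto-20240105.png".
function captureFilename(now = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  return 'makeup-vto-' + now.getFullYear() + pad(now.getMonth() + 1) + pad(now.getDate()) + '.png';
}

// Browser-only: trigger a save-as dialog for the captured data URL.
function downloadImage() {
  const a = document.createElement('a');
  a.href = document.querySelector('.image-captured').src;
  a.download = captureFilename(); // the `download` attribute hints the filename
  document.body.appendChild(a);
  a.click();
  a.remove();
}
```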

Running Your VTO App

To run this project, you need a simple local server. A critical requirement for accessing a user’s webcam in modern browsers is a secure origin: your application must be served over HTTPS or from http://localhost.

If you just run the vto.html file directly from your filesystem (e.g., file:///...), the browser will block camera access.

A quick way to spin up a local HTTP server for testing is with Node.js: npx http-server

Then just open http://localhost:8080/vto.html in your browser. When deploying to a live server, you must use an HTTPS certificate for the camera functionality to work.

Conclusion

You now have a complete, high-performance web-based VTO application. We’ve combined MindAR’s face tracking with the flexibility of Three.js to create a layered makeup system. By using multiple render targets, post-processing, and dynamic texture recoloring, we’ve built a polished experience that is easily customizable via a simple JSON file. All of this, running directly in the browser with no build step required.

See the application live in action by checking out our demo, or download the source code to get started.

Need Help with Your Custom Software Project?

We hope this technical guide was helpful! At Keption, we specialize in building custom, high-performance software applications, from complex web platforms like this VTO experience to enterprise-grade IT solutions.

If you’re looking for expert IT consulting or need a dedicated development team to bring your digital product to life, we’re here to help.

Contact us today to discuss your software needs!