After extensive experimentation with various classification and regression problems, we decided to start building our own prototype to estimate body dimensions from a single image.
The dataset used to train this model consists of images derived from 3D models in the CAESAR database. The 3D models in this database come in the Polygon File Format (PLY), which stores the object’s vertices and faces.
We used the information in these files to construct 2D silhouettes from fixed viewpoints. The resulting images look like the following, with a frontal and a side view:
The example above already reveals a few flaws. First, there are some white pixels in the middle of the bodies, mostly around the hands, feet and head. Second, some images contain stray pixels around the body that shouldn’t be there; in the example above, there are extra pixels around the person’s waist. The biggest flaw of the transformed images, however, is the lack of shading that would normally give an indication of depth. We are not certain that this is necessary for determining the sizes of various body parts, so for now we’re going to focus on just the silhouette. In the future, we will also experiment with images that include shading.
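To make the silhouette construction concrete, here is a simplified sketch of how vertex data from a PLY file could be projected into a 2D binary image. This is an illustrative assumption, not the prototype’s exact pipeline: it only marks projected vertex pixels, whereas a full renderer would also fill the projected faces (which is one way the white-pixel holes mentioned above can arise).

```python
import numpy as np

def silhouette_from_vertices(vertices, size=256, view="front"):
    """Rasterize a binary silhouette by orthographically projecting
    3D vertices onto an image plane.

    `vertices` is an (N, 3) array of (x, y, z) positions, as read from
    a PLY file's vertex list. For a frontal view we drop the depth
    axis (z); for a side view we drop x instead.
    """
    v = np.asarray(vertices, dtype=float)
    if view == "front":
        pts = v[:, [0, 1]]   # keep x (width) and y (height)
    else:
        pts = v[:, [2, 1]]   # keep z (depth) and y (height)

    # Normalize coordinates to [0, size-1] so the body fills the image.
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    scaled = (pts - mins) / np.maximum(maxs - mins, 1e-9) * (size - 1)

    img = np.zeros((size, size), dtype=np.uint8)
    cols = scaled[:, 0].astype(int)
    rows = (size - 1 - scaled[:, 1]).astype(int)  # flip so +y points up
    img[rows, cols] = 255
    return img
```

Note that the normalization step stretches every body to fill the frame, which is exactly why absolute scale is lost and has to be reintroduced separately.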
For this first prototype, only the frontal view was used to train the model. Since the images are all the same size while the people in the underlying 3D models vary in size, we needed a way to determine scale within the image. To give the model a sense of scale, we decided to embed the person’s height into the image’s binary data.
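One way to embed the height is to pack it into a few pixels of the image itself. The encoding below, a 32-bit integer written into the top-left pixels of the grayscale silhouette, is an illustrative assumption rather than the exact encoding used in the prototype:

```python
import struct
import numpy as np

def embed_height(img, height_mm):
    """Store the person's stature (in mm) in the first four pixels of a
    grayscale uint8 silhouette, so the network can recover absolute scale.
    NOTE: packing a little-endian 32-bit uint into the top-left pixels is
    a hypothetical encoding chosen for illustration.
    """
    out = img.copy()
    out.flat[:4] = np.frombuffer(struct.pack("<I", int(height_mm)),
                                 dtype=np.uint8)
    return out

def read_height(img):
    """Recover the embedded stature from the first four pixels."""
    return struct.unpack("<I", bytes(img.flat[:4]))[0]
```

A round trip (`read_height(embed_height(img, 1780))`) returns the original 1780, so the scale information survives inside the pixel data.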
The variables that we want to predict are the following:
```python
y_columns = [
    'Stature (mm)',
    'Neck Base Circumference (mm)',
    'Chest Girth (Chest Circumference at Scye) (mm)',
    'Bust/Chest Circumference Under Bust (mm)',
    'Waist Circumference, Pref (mm)',
    'Hip Circumference, Maximum (mm)',
    'Thigh Circumference (mm)',
    'Shoulder Breadth (mm)',
    'Arm Length (Shoulder to Elbow) (mm)',
    'Arm Length (Shoulder to Wrist) (mm)',
    'Hand Circumference (mm)',
    'Hand Length (mm)',
    'Head Circumference (mm)',
    'Head Length (mm)',
    'Crotch Height (mm)',
    'Ankle Circumference (mm)',
]
```
More variables are available, but these seem to be the most important for tailored clothing. These can be removed or new ones can be added in the future, if needed.
Since this is the first prototype, we decided to simply come up with a model and see where it went from there. In future prototypes and experiments, we are going to change the model by adding layers, removing layers and changing how the layers are configured.
This is our first attempt at creating a model for our prototype. This was chosen completely at random.
Future experimentation will be done based on research into other projects that also do human detection and comparisons between the results of each model.
The input layer takes a 256 by 256 pixel image. The final (Dense) layer outputs the 15 variables that we attempt to predict. The layers in between were chosen mostly at random, and these are going to be experimented with extensively.
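To make the input and output shapes concrete, here is a minimal NumPy forward pass with the same boundaries: a 256×256 image in, 15 values out. The single hidden layer is a hypothetical stand-in for the (still arbitrary) middle of the network, not a reproduction of the prototype’s actual layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(img, w_hidden, w_out):
    """Shape-only sketch: 256x256 image -> flatten -> hidden -> 15 outputs."""
    x = img.reshape(-1)                # flatten 256*256 -> 65536 values
    h = np.maximum(x @ w_hidden, 0.0)  # hypothetical hidden layer (ReLU)
    return h @ w_out                   # final dense layer -> 15 predictions

# Randomly initialized weights, standing in for an untrained network.
w_hidden = rng.normal(size=(256 * 256, 32)) * 0.01
w_out = rng.normal(size=(32, 15)) * 0.01

pred = forward(rng.random((256, 256)), w_hidden, w_out)
```

Whatever the middle layers end up being, the two fixed points are the 256×256 input and the 15-value output vector.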
While training, we noticed that the mean squared error kept decreasing the longer the algorithm trained; after 1000 epochs it settled at a fairly consistent value of around 512.0382. Since MSE is measured in squared millimeters, this corresponds to a typical error of roughly 23 mm per measurement, so we thought the algorithm had successfully learned to predict body measurements quite accurately.
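A quick sanity check on interpreting that number: an MSE is in squared units, so the typical per-measurement error in millimeters is its square root.

```python
import math

# MSE is in squared millimeters; take the square root to get a
# typical error back in millimeters (the RMSE).
train_mse = 512.0382
train_rmse = math.sqrt(train_mse)  # ~22.6 mm typical training error
```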
Upon running the trained model on the test data, however, we obtained an MSE of 24577.8300781. To see just how wrong the trained model was, we fed it a sample image that had been excluded from both the training and test data. The body measurements we got back from the sample input were the following:
- Neck Base Circumference (mm): 98921.03
- Chest Girth (Chest Circumference at Scye) (mm): 195441.33
- Bust/Chest Circumference Under Bust (mm): 173896.53
- Waist Circumference, Pref (mm): 172310.03
- Hip Circumference, Maximum (mm): 214456.62
- Thigh Circumference (mm): 121626.22
- Shoulder Breadth (mm): 96353.445
- Arm Length (Shoulder to Elbow) (mm): 75532.664
- Arm Length (Shoulder to Wrist) (mm): 135804.39
- Hand Circumference (mm): 43767.73
- Hand Length (mm): 42001.52
- Head Circumference (mm): 118117.125
- Head Length (mm): 40810.56
- Crotch Height (mm): 186664.56
- Ankle Circumference (mm): 56043.742
Whereas the correct measurements were quite different:
- Neck Base Circumference (mm): 467
- Chest Girth (Chest Circumference at Scye) (mm): 888
- Bust/Chest Circumference Under Bust (mm): 766
- Waist Circumference, Pref (mm): 784
- Hip Circumference, Maximum (mm): 924
- Thigh Circumference (mm): 520
- Shoulder Breadth (mm): 413
- Arm Length (Shoulder to Elbow) (mm): 314
- Arm Length (Shoulder to Wrist) (mm): 585
- Hand Circumference (mm): 192
- Hand Length (mm): 196
- Head Circumference (mm): 546
- Head Length (mm): 197
- Crotch Height (mm): 733
- Ankle Circumference (mm): 262
Comparing the actual measurements with the predictions from our model gives the following results:
| Measurement | Predicted (mm) | Actual (mm) | Predicted / Actual (%) |
| --- | --- | --- | --- |
| Neck Base Circumference | 98921.03 | 467 | 21182.2334047 |
| Chest Girth (Chest Circumference at Scye) | 195441.33 | 888 | 22009.1587838 |
| Bust/Chest Circumference Under Bust | 173896.53 | 766 | 22701.8968668 |
| Waist Circumference, Pref | 172310.03 | 784 | 21978.3201531 |
| Hip Circumference, Maximum | 214456.62 | 924 | 23209.5909091 |
| Arm Length (Shoulder to Elbow) | 75532.664 | 314 | 24054.988535 |
| Arm Length (Shoulder to Wrist) | 135804.39 | 585 | 23214.425641 |
As can clearly be seen, every value is overestimated by a factor of roughly 200-250. We’re not yet sure where this comes from, but it is interesting that all of the parameters follow the same pattern: it suggests the issue lies somewhere in either the dataset or the model, and once it is solved, all of the measurements should improve together.
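The percentage column above is simply the prediction divided by the ground truth, expressed as a percentage, as can be reproduced for the first row:

```python
# Reproducing the table's percentage column: predicted / actual * 100.
predicted = 98921.03   # Neck Base Circumference, model output (mm)
actual = 467.0         # Neck Base Circumference, ground truth (mm)

ratio_pct = predicted / actual * 100
# ratio_pct is about 21182, i.e. the prediction is ~212x the true value
```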
After testing, a graph was automatically generated containing a plot of the mean squared error over the epochs. In this case it doesn’t provide much usable information, other than that the error dropped rapidly and then kept bouncing up and down at the lower values.
We learned how to prepare a dataset for use within a model. Furthermore, we learned how to set up a model that works with that dataset, and how to set up a proper testing framework with automated graph generation.
While this was quite a learning experience in itself, there are several more things we can do to improve the prototype, and these are what we’re going to look at in the next few weeks:
- Ensure that the input is prepared correctly and that the imputation used does not produce false results
- Switch to cross-validation rather than a single training/test split
- Ensure that we do not have any data leakage
- Check whether we can use pipelines and whether they are helpful in our case
- Make the training graphs more usable by adding more graphs and fixing the existing one
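The planned switch to cross-validation could look something like the sketch below, which generates k-fold train/test index splits without relying on any particular library (a library helper such as scikit-learn’s `KFold` would do the same job):

```python
def kfold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Each sample lands in the test fold exactly once, so every image is
    evaluated on exactly one of the k runs instead of relying on a
    single train/test split.
    """
    # Distribute any remainder across the first folds so sizes differ
    # by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

Averaging the test MSE over the k folds would give a more reliable estimate of the model’s error than the single split used so far.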
Besides these changes, we’re also going to experiment extensively with different layers within the model. We will research other projects that do similar things, so that we can learn from their decision making and adopt similar setups where appropriate.