第三章：训练图像估计光照度算法模型

Red Linux 2024-11-06 3396

描述

前言

这一篇就到了图像估计光照度算法章节，这篇我主要记录如何使用 tensorflow2 训练一个从图片中估计光照度的算法。一般的流程是拍摄多张图片以及用光照度计来检测其光照度值，分别作为输入和输出。但是在本章呢，为了起到演示的作用，数据集我会使用 [MIT-Adobe FiveK Dataset] 。光照度值呢，我使用图片的 rgb 数值经过算法r0.2126+g0.7152+b*0.0722计算亮度。这样就有了一定数量的数据集。也就有基础进行后续的训练和测试了。下面准备进入正文。

数据获取

因为 MIT-Adobe FiveK Dataset 数据集包含了 5000 张原始 dng 图像和 5 和专家A,B,C,D,E进行处理之后的 tiff 图像(一般地，这个数据集是用来训练图像增强相关的模型的，我这里就用来训练光照度估计算法了，嘿嘿)。因为完整的数据压缩包太大了~50GB。受限电脑的容量和速度，我选择了使用脚本逐个下载这些图片（因为这些图片的下载路径有规律，再加上这些图片的名字在官网可以下载下来，所以脚本就读取包含图片名字的文件，然后逐个拼接下载路径，使用 curl 工具完成下载）。这里，我选择了下载原始 dng 图片和专家 C 的 tiff 图片。
下载 dng 原始文件的脚本是：

#!/usr/bin/bash

#改变当前工作路径
CURRENT_PATH="/home/red/Downloads/fivek_dataset/expertc"
#本文件所在路径
cd ${CURRENT_PATH}
#改变当前路径

#存储图像名称的list
img_lst=[]
#读取图片名列表
files_name=`cat filesAdobe.txt`
files_mit_name=`cat filesAdobeMIT.txt`

j=0
for i in ${files_name};do
    # https://data.csail.mit.edu/graphics/fivek/img/dng/a0001-jmac_DSC1459.dng
    URL='https://data.csail.mit.edu/graphics/fivek/img/dng/'${i}'.dng'
    file_cur=${URL##*/}
    echo "Downloading ${URL}@${j}"
    j=$((j+1))
    if [ -f ${file_cur} ];then
        echo "${file_cur} exist"
    else
        # echo "${file_cur} no exist, it's you"
        # break
        curl -O ${URL}
    fi
done

下载专家 C 处理后的文件脚本是：

#!/usr/bin/bash

#改变当前工作路径
CURRENT_PATH="/home/red/Downloads/fivek_dataset/expertc"
#本文件所在路径
cd ${CURRENT_PATH}
#改变当前路径

#存储图像名称的list
img_lst=[]
#读取图片名列表
files_name=`cat filesAdobe.txt`
files_mit_name=`cat filesAdobeMIT.txt`

j=0
for i in ${files_name};do
    #下载由 expert C 所调整的图像(可根据需要下载其它的四类图像)
    URL='https://data.csail.mit.edu/graphics/fivek/img/tiff16_c/'${i}'.tif'
    file_cur=${URL##*/}
    echo "Downloading ${URL}@${j}"
    j=$((j+1))
    if [ -f ${file_cur} ];then
        echo "${file_cur} exist"
    else
        echo "${file_cur} no exist, it's you"
        # break
        curl -O ${URL}
    fi
done

经过了好几天断断续续的下载，最后我一共得到了 1000 张左右图片。有了图片之后，下一步就是计算光照度了，这里使用 python 脚本和 pillow 包完成，为了后续移植到 AI300G 上，我将图片缩放到了统一的 255*255。并且将计算的光照度和图像的名称存储到一个 csv 文件。这部分脚本如下：

#!/bin/env python3

import sys
import csv
import os
import re

from PIL import Image

gs_illumiance_csv_file_fd=0
gs_illumiance_csv_file_name='illumiance_estimate.csv'
gs_illumiance_data_list=[['Name', 'Illuminance']]
DEST_DIR_NAME=r'PNG255'

def illuname_estimate(t):
    r,g,b=t
    return r*0.2126+g*0.7152+b*0.0722


def get_pic_pixels(pic_name):
    with Image.open(pic_name) as pic:
        ans=0
        pic=pic.resize((255,255))
        print(f'raw name:{pic_name}')
        match=re.match(r'w+/(S+).w+', pic_name)
        if match:
            basename=match.group(1)
            basename=DEST_DIR_NAME+'/'+basename+'.png'
            print(f'new name:{basename}')
            pic.save(basename)
            #  pic.show()
        width, heigh = pic.size
        for x in range(width):
            for y in range(heigh):
                r, g, b = pic.getpixel((x, y))
                ans=ans+illuname_estimate((r,g,b))

    # 光照度取整
    ans=round(ans)
    print(f'{pic_name}: illuname ans:{ans}')
    return ans

def insert_item(pic_name, illumiance_estimate):
    global gs_illumiance_csv_file_fd
    global gs_illumiance_csv_file_name
    global gs_illumiance_data_list
    item_template=['NONE', -1]
    item_template[0]=pic_name
    item_template[1]=illumiance_estimate
    gs_illumiance_data_list.append(item_template)

def do_with_dir(dir_name):
    for filename in os.listdir(dir_name):
        filepath=os.path.join(dir_name, filename)
        if (os.path.isfile(filepath)):
            print("do input %s" %(filepath))
            ans=get_pic_pixels(filepath)
            insert_item(filename, ans)
            #  return

if len(sys.argv) > 1:
    print("do input dir:%s" %(sys.argv[1]))
    if not os.path.exists(DEST_DIR_NAME):
        os.makedirs(DEST_DIR_NAME)
    do_with_dir(sys.argv[1])
    gs_illumiance_csv_file_fd=open(gs_illumiance_csv_file_name, 'w', newline='')
    csv.writer(gs_illumiance_csv_file_fd).writerows(gs_illumiance_data_list)
else:
    print("Please input pic name")

这样就得到了类似下面的数据集：

▸ head illumiance_estimate.csv
Name,Illuminance
a0351-MB_070908_006_dng.jpeg,3680630
a0100-AlexWed07-9691_dng.jpeg,1258657
a0147-kme_333.jpeg,5168820
a0261-_DSC2228_dng.jpeg,2571498
a0255-_DSC1448.jpeg,8747593
a0054-kme_097.jpeg,5351908
a0393-_DSC0040.jpeg,1783394
a0304-dgw_137_dng.jpeg,3118835
a0437-jmacDSC_0011.jpeg,6140107

至此有了一定数量的数据集（这里我使用了667张照片），接下来就是模型训练了。

模型训练

模型训练的基本思想就是，首先将数据集按比例(4:1)拆分为训练集和测试集，然后使用 tensorflow 建立模型训练参数进行检验。
大概流程是：

首先是根据 csv 文件建立 tensorflow dataset 格式的数据集；
建立模型使用数据集进行模型训练和测试

这部分代码为：

#!/usr/bin/python3.11

TF_ENABLE_ONEDNN_OPTS=0

import numpy as np
import os
import PIL
import PIL.Image
import tensorflow as tf
import pathlib
import csv
import pandas as pd
import tensorflow.data
import sys
import matplotlib.pyplot as plt

AUTOTUNE=tensorflow.data.AUTOTUNE
BATCH_SIZE=32
IMG_WIDTH=255
IMG_HEIGHT=255
ILLUMINACE_FILE=r'illumiance_estimate.csv'
print(tf.__version__)

import tensorflow as tf
import pandas as pd

image_count = len(os.listdir(r'JP'))
print(f'whole img count={image_count}')
# 假设CSV文件有两列：'image_path' 和 'label'
df = pd.read_csv(ILLUMINACE_FILE)

# 将DataFrame转换为TensorFlow可以处理的格式
image_paths = df['Name'].values
labels = df['Illuminance'].values
labels = labels.astype(np.float32)
labels /= 16777215.0

# 创建一个Dataset
gs_dataset = tf.data.Dataset.from_tensor_slices((image_paths, labels))

print(type(gs_dataset))
print(gs_dataset)
print(r'-------------------------------------------')
# 定义一个函数来加载和预处理图像
def load_and_preprocess_image(image_path, label):
    print(image_path)
    image_path='JP/'+image_path
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [IMG_WIDTH, IMG_HEIGHT])
    #  image /= 255.0  # 归一化
    return image, label

# 应用这个函数到Dataset上
gs_dataset = gs_dataset.map(load_and_preprocess_image)
# 打乱数据
gs_dataset = gs_dataset.shuffle(image_count, reshuffle_each_iteration=False)

val_size = int(image_count * 0.2)

gs_train_ds = gs_dataset.skip(val_size)
gs_val_ds = gs_dataset.take(val_size)

def configure_for_performance(ds):
    ds = ds.cache()
    ds = ds.shuffle(buffer_size=1000)
    ds = ds.batch(BATCH_SIZE)
    ds = ds.prefetch(buffer_size=AUTOTUNE)
    return ds

gs_train_ds = configure_for_performance(gs_train_ds)
gs_val_ds = configure_for_performance(gs_val_ds)

image_batch, illuminance_batch = next(iter(gs_train_ds))

#  plt.figure(figsize=(10, 10))

#  for i in range(9):
  #  ax = plt.subplot(3, 3, i + 1)
  #  print(image_batch[i])
  #  #  img_data=image_batch[i].numpy()*255.0
  #  #  plt.imshow(img_data.astype("uint8"))
  #  plt.imshow(image_batch[i].numpy().astype("uint8"))
  #  illuminance = illuminance_batch[i]
  #  plt.title(illuminance.numpy())
  #  plt.axis("off")

#  plt.show()

#  sys.exit()

model = tf.keras.Sequential([
  tf.keras.layers.Rescaling(1./255),
  tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(1)
])

model.compile(
  optimizer='adam',
  loss='mean_squared_error')

model.fit(
  gs_train_ds,
  validation_data=gs_val_ds,
  epochs=12
)

model.save("illu_v01")

执行上述代码，可以看到最后的 loss 和 val_loss 为：

▸ ./train_tf2_v2.py
2024-08-08 13:41:48.341117: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-08-08 13:41:48.342596: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-08 13:41:48.363696: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-08 13:41:48.363729: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-08 13:41:48.364549: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-08 13:41:48.368601: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-08 13:41:48.368762: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-08 13:41:48.801750: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2.15.0
whole img count=667
2024-08-08 13:41:51.138713: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-08-08 13:41:51.139135: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2256] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
< class 'tensorflow.python.data.ops.from_tensor_slices_op._TensorSliceDataset' >
< _TensorSliceDataset element_spec=(TensorSpec(shape=(), dtype=tf.string, name=None), TensorSpec(shape=(), dtype=tf.float32, name=None)) >
-------------------------------------------
Tensor("args_0:0", shape=(), dtype=string)
Epoch 1/12
17/17 [==============================] - 11s 603ms/step - loss: 98.9302 - val_loss: 0.1012
Epoch 2/12
17/17 [==============================] - 8s 495ms/step - loss: 0.0493 - val_loss: 0.0043
Epoch 3/12
17/17 [==============================] - 8s 481ms/step - loss: 0.0078 - val_loss: 0.0043
Epoch 4/12
17/17 [==============================] - 8s 479ms/step - loss: 0.0025 - val_loss: 0.0040
Epoch 5/12
17/17 [==============================] - 8s 477ms/step - loss: 0.0023 - val_loss: 0.0029
Epoch 6/12
17/17 [==============================] - 8s 480ms/step - loss: 0.0021 - val_loss: 0.0028
Epoch 7/12
17/17 [==============================] - 8s 482ms/step - loss: 0.0020 - val_loss: 0.0028
Epoch 8/12
17/17 [==============================] - 8s 482ms/step - loss: 0.0019 - val_loss: 0.0027
Epoch 9/12
17/17 [==============================] - 8s 482ms/step - loss: 0.0018 - val_loss: 0.0026
Epoch 10/12
17/17 [==============================] - 8s 485ms/step - loss: 0.0017 - val_loss: 0.0026
Epoch 11/12
17/17 [==============================] - 8s 485ms/step - loss: 0.0015 - val_loss: 0.0023
Epoch 12/12
17/17 [==============================] - 8s 484ms/step - loss: 0.0011 - val_loss: 0.0020

并且模型也保存在了 illu_v01 目录。

▸ ls illu_v01/
assets  fingerprint.pb  keras_metadata.pb  saved_model.pb  variables

模型测试

现在有可模型，下面就是测试下自己的模型，使用下述 python 代码在 PC 端进行测试：

#!/usr/bin/python3.11

import numpy as np
import os
import PIL
import PIL.Image
import tensorflow as tf
import pathlib
import csv
import pandas as pd
import tensorflow.data
import sys
import matplotlib.pyplot as plt

IMG_WIDTH=255
IMG_HEIGHT=255

reload_model=tf.keras.models.load_model("illu_v01")
image_path=r'./JP/a0001-jmac_DSC1459.jpeg'
if len(sys.argv) < 2:
    print('Please input some pic to predict')
    sys.exit()
else:
    image_path=sys.argv[1]


image = tf.io.read_file(image_path)
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, [IMG_WIDTH, IMG_HEIGHT])
image = tf.reshape(image, [1, IMG_WIDTH, IMG_HEIGHT, 3])

#  sys.exit()

predictions=reload_model.predict(image)
print(f'{image_path} ans={predictions*16777215}')

简单测试下模型：

check_tf2.py JP/a0001-jmac_DSC1459.jpeg
2024-08-08 13:57:08.263506: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-08-08 13:57:08.264895: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-08 13:57:08.285614: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-08 13:57:08.285646: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-08 13:57:08.286510: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-08 13:57:08.290464: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-08 13:57:08.290608: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-08 13:57:08.725843: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-08-08 13:57:11.051710: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-08-08 13:57:11.051982: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2256] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
1/1 [==============================] - 0s 57ms/step
JP/a0001-jmac_DSC1459.jpeg ans=[[check_tf2.py JP/a0001-jmac_DSC1459.jpeg
2024-08-08 13:57:08.263506: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-08-08 13:57:08.264895: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-08 13:57:08.285614: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-08 13:57:08.285646: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-08 13:57:08.286510: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-08 13:57:08.290464: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-08-08 13:57:08.290608: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-08 13:57:08.725843: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-08-08 13:57:11.051710: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-08-08 13:57:11.051982: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2256] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
1/1 [==============================] - 0s 57ms/step
JP/a0001-jmac_DSC1459.jpeg ans=[[5459503.]].]]

发现估计的光照度值是 5459503 和实际的 5363799 对比一下还是有15%左右的误差。但是目前为止，整个模型训练测试流程已经完成，下一步在是PC端模拟拉流使用模型对图像进行实时计算了，期待哦。

审核编辑黄宇

打开APP阅读更多精彩内容