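Project introduction
This project is a voice-controlled smart camera based on the MAX78000, designed around a "blind shooting" experience. The camera has no preview, so users only find out what they captured when they view the photos afterwards. Using the MAX78000's low power consumption and its AI accelerator, we built a portable smart camera that is operated by voice commands, so that every shot carries some anticipation and surprise.
(A voice command triggers the camera to take a photo and store it on the SD card.)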
Project design approach
Approach to collecting material
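Getting started with the MAX78000 (VSCode + Anaconda + PyCharm on Windows)
0.1 Required software
Link: https://pan.baidu.com/s/1YX1EqE9mPjTdhPpuVd7L5w?pwd=zhnb
Extraction code: zhnb
- MaximMicrosSDK_win (official SDK)
- VSCode (firmware development and flashing)
- git (for pulling the official code)
- Anaconda + PyCharm (software development and model training)
(For the detailed version, see the attachment “MAX78000入门(win下 VSCode+Anaconda+Pycharm).html”.)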
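Pre-training procedure
Tutorial: https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/Guides/Making%20Your%20Own%20Audio%20and%20Image%20Classification%20Application%20Using%20Keyword%20Spotting%20and%20Cats-vs-Dogs.md
1. Add the dataset
2. Modify the relevant code
For the log generated during pre-training, see the attachment (“2023.10.07-223536.log”).
The model is then quantized and evaluated, and a library that the firmware can call is generated.
After compiling and flashing, check the result with a serial debugging tool.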
Results
Once the board recognizes the keyword KACA, the following output appears in the serial debugging assistant:
[2023-12-16 22:53:24.768]# RECV ASCII>
Word starts from index 14464 to 5760, padded with 4864 zeros, avg:489 > 350
324608: Starts CNN: 11
324608: Completes CNN: 11
CNN Time: 1850 us
Min: -84, Max: 60
-----------------------------------------
Detected word: Unknown
-----------------------------------------
[2023-12-16 22:53:26.548]# RECV ASCII>
Word starts from index 12928 to 5120, padded with 3968 zeros, avg:351 > 350
346112: Starts CNN: 12
346112: Completes CNN: 12
CNN Time: 1850 us
Min: -58, Max: 45
-----------------------------------------
.
Error creating directory: 268703948
.
photo_1------->1
photo_2------->2
photo_3------->3
photo_4------->4
photo_5------->5
Configuring camera
Starting streaming capture...
Done! (Took 41685 us)
DMA transfer count = 240
OVERFLOW = 0
Saving image to /PHOTO/photo_6
0.0%
3.3%
6.7%
10.0%
13.3%
16.7%
20.0%
23.3%
26.7%
30.0%
[2023-12-16 22:53:26.642]# RECV ASCII>
33.3%
36.7%
[2023-12-16 22:53:26.706]# RECV ASCII>
40.0%
[2023-12-16 22:53:26.770]# RECV ASCII>
43.3%
[2023-12-16 22:53:26.832]# RECV ASCII>
46.7%
[2023-12-16 22:53:26.944]# RECV ASCII>
50.0%
53.3%
[2023-12-16 22:53:26.991]# RECV ASCII>
56.7%
[2023-12-16 22:53:27.102]# RECV ASCII>
60.0%
63.3%
[2023-12-16 22:53:27.210]# RECV ASCII>
66.7%
70.0%
[2023-12-16 22:53:27.258]# RECV ASCII>
73.3%
[2023-12-16 22:53:27.305]# RECV ASCII>
76.7%
[2023-12-16 22:53:27.400]# RECV ASCII>
80.0%
83.3%
[2023-12-16 22:53:27.460]# RECV ASCII>
86.7%
[2023-12-16 22:53:27.696]# RECV ASCII>
90.0%
93.3%
96.7%
Finished (took 1437253us)
ANALOG DEVICES
Keyword Spotting Demo
Ver. 3.2.3 (5/05/23)
***** Init *****
pAI85Buffer: 16384
Detected word: kaca (97.0%)
-----------------------------------------
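A log like the one above can be checked automatically rather than by eye. The following is a minimal sketch, assuming the log has been saved as plain text; the function name `parse_serial_log` and the two regular expressions are illustrative and are not part of the project's code. It extracts every "Detected word" line (with its confidence, when one is printed) and every "Saving image to" path.

```python
import re

# Matches lines such as "Detected word: kaca (97.0%)" or "Detected word: Unknown".
DETECTED = re.compile(r"Detected word:\s*(\w+)(?:\s*\((\d+(?:\.\d+)?)%\))?")
# Matches lines such as "Saving image to /PHOTO/photo_6".
SAVED = re.compile(r"Saving image to\s+(\S+)")


def parse_serial_log(text):
    """Return (detections, saved_paths) extracted from a captured serial log.

    detections is a list of (word, confidence) tuples; confidence is None
    when the log line carries no percentage.
    """
    detections = []
    saved_paths = []
    for line in text.splitlines():
        match = DETECTED.search(line)
        if match:
            word = match.group(1)
            confidence = float(match.group(2)) if match.group(2) else None
            detections.append((word, confidence))
            continue
        match = SAVED.search(line)
        if match:
            saved_paths.append(match.group(1))
    return detections, saved_paths


# A few lines copied from the log above, used as sample input.
sample = """Detected word: Unknown
Saving image to /PHOTO/photo_6
Detected word: kaca (97.0%)
"""
detections, saved_paths = parse_serial_log(sample)
print(detections)
print(saved_paths)
```

Timestamp lines such as `[2023-12-16 22:53:26.642]# RECV ASCII>` are inserted by the serial tool and are simply skipped, since they match neither pattern.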
The image is saved to the /PHOTO/ folder on the SD card. Use the "Image Viewer" (图片查看器) utility to convert the image format:
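The write-up does not say which pixel format the firmware writes to the SD card. As an illustration only, the sketch below assumes the file holds raw RGB565 pixels (2 bytes per pixel), a common camera output format; the function name `rgb565_to_rgb888` and the `big_endian` parameter are hypothetical, and the actual viewer in the repository may work differently.

```python
import struct


def rgb565_to_rgb888(raw, big_endian=True):
    """Convert raw RGB565 bytes to packed 8-bit RGB bytes.

    ASSUMPTION: the input is headerless RGB565 data. The byte order
    depends on the camera configuration, so it is left as a parameter.
    """
    if len(raw) % 2 != 0:
        raise ValueError("RGB565 data must contain an even number of bytes")
    count = len(raw) // 2
    order = ">" if big_endian else "<"
    pixels = struct.unpack(f"{order}{count}H", raw)

    out = bytearray()
    for pixel in pixels:
        r5 = (pixel >> 11) & 0x1F
        g6 = (pixel >> 5) & 0x3F
        b5 = pixel & 0x1F
        # Expand each channel to 8 bits by replicating its high bits
        # into the low bits, so full scale maps to 255.
        out.append((r5 << 3) | (r5 >> 2))
        out.append((g6 << 2) | (g6 >> 4))
        out.append((b5 << 3) | (b5 >> 2))
    return bytes(out)


# One pure-red pixel in big-endian RGB565 (0xF800).
print(rgb565_to_rgb888(b"\xF8\x00").hex())
```

If Pillow is available, the converted bytes for a 320x240 frame could then be wrapped with `Image.frombytes("RGB", (320, 240), data)` and saved in an ordinary format such as PNG.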
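Main difficulty and solution:
Both photo capture and keyword recognition in the board's main program use the hardware CNN, so the image becomes misaligned when its size is too large, as shown below:
Solution: make the image as large as possible while ensuring that no misalignment occurs. The final image size is 320x240.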
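To get a feel for how quickly the frame buffer grows with resolution, the arithmetic can be sketched as follows. This assumes 2 bytes per pixel (for example RGB565), which the write-up does not state; `frame_bytes` is a hypothetical helper, and the resolutions other than 320x240 are included only for comparison.

```python
def frame_bytes(width, height, bytes_per_pixel=2):
    """Size of one uncompressed frame in bytes.

    ASSUMPTION: 2 bytes per pixel; adjust if the camera is configured
    for a different pixel format.
    """
    return width * height * bytes_per_pixel


for width, height in [(160, 120), (320, 240), (640, 480)]:
    size = frame_bytes(width, height)
    print(f"{width}x{height}: {size} bytes ({size / 1024:.1f} KiB)")
```

Doubling both dimensions quadruples the buffer, which is why a moderate step up in resolution can exhaust memory that the capture path shares with the keyword-spotting network.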
Source code: https://github.com/BOOME4/MAX78000.git
References:
- Hardware development
- General resources: Analog Devices AI (github.com)
- 1. Training:
- https://github.com/MaximIntegratedAI/MaximAI_Documentation/blob/master/Guides/Making%20Your%20Own%20Audio%20and%20Image%20Classification%20Application%20Using%20Keyword%20Spotting%20and%20Cats-vs-Dogs.md
- 2. Demos and ideas:
- msdk/Examples/MAX78000 at main · Analog-Devices-MSDK/msdk (github.com)
- MAX78000 speech recognition - 电子森林 (eetree.cn)
- Software development:
- 1. ttkbootstrap tutorial
- Tutorial - ttkbootstrap
- 2. Image display
- ttkbootstrap image display / bootstrap image viewer - CSDN blog
- 3. Packaging a Python script as an .exe program (with the environment switched to Anaconda)
- "It took a week to put together! The most complete Python tkinter tutorial online, covering every topic, for building good-looking tk programs with ease"