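How to Use Python to Access the System Camera: A Basic Tutorial on Common Libraries for Face Recognition, Image Processing, and Image Recognition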

Date: 2019-02-08 11:29:03

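01. Controlling the camera with Python

First, install the third-party library opencv-python.

OpenCV is an open-source computer vision library written in C and C++ that runs on Windows, Linux, Mac OS X, and other systems. Interfaces for Python, Java, Matlab, and other languages are also under active development, and the library has been ported to Android and iOS for building mobile applications.

The OpenCV library contains more than 500 functions drawn from many areas of computer vision, including industrial product quality inspection, medical image processing, security, human-computer interaction, camera calibration, stereo vision, and robotics.

Install the module we need: pip install opencv-python

How reading and writing an image's pixel matrix works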

import cv2

image = cv2.imread("image/test.jpeg")
cv2.imshow("window", image)

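Once the program exits, its window is destroyed, so the image only flashes on screen for an instant. To keep the image displayed, add a call to cv2.waitKey() after cv2.imshow().

The function cv2.imread reads an image into its pixel matrix, in which each element is a color vector. Note that OpenCV stores the channels in BGR order, not RGB. The following example reads a solid-color image to show how this works: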

import cv2

# imread returns the pixel matrix; each pixel is a [b, g, r] vector (OpenCV uses BGR order).
# In the Python binding, a failed read shows up as None.
image = cv2.imread("image/test_98_98.png")
"""
The function imread loads an image from the specified file and returns it. If the image cannot be
read (because of missing file, improper permissions, unsupported or invalid format), the function
returns an empty matrix ( Mat::data==NULL ).
"""
print(len(image))     # height in pixels
print(len(image[0]))  # width in pixels
print(image)          # the pixel matrix (3-dimensional array)
cv2.imshow("window", image)
cv2.waitKey(0)

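The result is a three-dimensional array (a NumPy ndarray): the outermost dimension is the height, the middle dimension is the width, and the innermost dimension holds the three color channels in BGR order.

For example, after reading a PNG image filled with the single color (40, 44, 52), every pixel vector in the matrix turns out to be identical.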
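
The layout described above can be mimicked without OpenCV. The following is a minimal sketch that builds a tiny solid-color image as nested lists. The helper names make_solid_bgr and bgr_to_rgb are hypothetical, and the sketch assumes that (40, 44, 52) is the RGB value of the color, in which case OpenCV would store each pixel as [52, 44, 40]. A real cv2.imread result is a NumPy array, but it is indexed the same way.

```python
def make_solid_bgr(height, width, rgb):
    """Build a height x width image whose pixels are [b, g, r] lists."""
    r, g, b = rgb
    return [[[b, g, r] for _ in range(width)] for _ in range(height)]


def bgr_to_rgb(image):
    """Return a copy of the image with each pixel reordered from BGR to RGB."""
    return [[pixel[::-1] for pixel in row] for row in image]


image = make_solid_bgr(2, 3, (40, 44, 52))
print(len(image))               # height: 2
print(len(image[0]))            # width: 3
print(image[0][0])              # first pixel in BGR order: [52, 44, 40]
print(bgr_to_rgb(image)[0][0])  # same pixel in RGB order: [40, 44, 52]
```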

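If we read the same file with ordinary file I/O instead, the result is the raw encoded bytes of the file rather than a pixel matrix: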

with open('./image/test_98_98.png', 'rb') as f:
    print(f.read())
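
To make the difference between raw bytes and decoded pixels concrete, here is a small sketch that pulls the width and height straight out of the raw PNG bytes. It relies only on the PNG file layout (an 8-byte signature followed by the IHDR chunk). The function name png_size and the hand-built sample header are illustrative assumptions, not part of the original code.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of raw PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers right after the chunk type.
    return struct.unpack(">II", data[16:24])


# A hand-built header for illustration: signature, chunk length 13, "IHDR", 98 x 98.
sample = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 98, 98)
print(png_size(sample))  # (98, 98)
```

With a real file, passing the first 24 bytes is enough, for example png_size(f.read(24)) inside the with block shown earlier.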

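Writing and testing the code

Install the app IP摄像头 (IP Camera) on an Android phone.

If your app store does not carry it, search for it with Baidu on the phone and download it. After installation, tap "打开IP摄像头服务器" (start the IP camera server) at the bottom of the app's screen and select the LAN IP address.

For the connection to succeed, the phone and the computer must be on the same local network, for example the same hotspot or the same Wi-Fi.

As a preliminary test, first open the computer's local camera: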

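import cv2

capture = cv2.VideoCapture(0)  # 0 selects the computer's built-in camera
while True:
    # ret tells whether a frame was read successfully; frame is the current video frame
    ret, frame = capture.read()
    if not ret:
        break
    # The camera faces the user, so mirror the image horizontally for a natural view.
    frame = cv2.flip(frame, 1)
    cv2.imshow("video", frame)
    c = cv2.waitKey(50)
    if c == 27:  # 27 is the Esc key
        break
capture.release()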

About waitKey(delay=None): the delay parameter is how long to wait, in milliseconds, and the call blocks for that time. Passing 0 blocks indefinitely until a keyboard event occurs.
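
The value returned by cv2.waitKey() drives the loops in this section, so it helps to see the decision logic in isolation. The sketch below is a plain-Python illustration. The function name classify_key and the action names are hypothetical, and masking with 0xFF is a common precaution rather than something the original code does.

```python
ESC_KEY = 27


def classify_key(code):
    """Map a cv2.waitKey() return value to an action name."""
    if code == -1:          # no key was pressed before the delay expired
        return "none"
    code &= 0xFF            # keep the low byte; some platforms set higher bits
    if code == ESC_KEY:
        return "quit"
    if code == ord(" "):
        return "save"
    return "ignore"


print(classify_key(27))        # quit
print(classify_key(ord(" ")))  # save
print(classify_key(-1))        # none
```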

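About flip(src, flipCode, dst=None): the flipCode parameter selects the kind of flip and has three cases: greater than 0 (horizontal flip), equal to 0 (vertical flip), and less than 0 (both at once). The source comments below describe them in detail.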

def waitKey(delay=None):  # real signature unknown; restored from __doc__
    # @param delay Delay in milliseconds. 0 is the special value that means "forever".
    pass

def flip(src, flipCode, dst=None):  # real signature unknown; restored from __doc__
    """
    The function cv::flip flips the array in one of three different ways (row
    and column indices are 0-based):
    The example scenarios of using the function are the following:
    * Vertical flipping of the image (flipCode == 0) to switch between
      top-left and bottom-left image origin. This is a typical operation
      in video processing on Microsoft Windows OS.
    * Horizontal flipping of the image with the subsequent horizontal
      shift and absolute difference calculation to check for a
      vertical-axis symmetry (flipCode > 0).
    * Simultaneous horizontal and vertical flipping of the image with
      the subsequent shift and absolute difference calculation to check
      for a central symmetry (flipCode < 0).
    * Reversing the order of point arrays (flipCode > 0 or
      flipCode == 0).
    """
    pass
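
The three flip modes can be reproduced on a small grid without OpenCV. This is a sketch on nested lists, meant only to illustrate the meaning of flipCode; cv2.flip itself operates on NumPy arrays.

```python
def flip(rows, flip_code):
    """Pure-Python sketch of the three flip modes on a list of rows."""
    if flip_code == 0:                       # vertical flip: reverse the row order
        return [list(row) for row in rows[::-1]]
    if flip_code > 0:                        # horizontal flip: reverse each row
        return [list(row[::-1]) for row in rows]
    return [list(row[::-1]) for row in rows[::-1]]  # both axes at once


grid = [[1, 2, 3],
        [4, 5, 6]]
print(flip(grid, 1))   # [[3, 2, 1], [6, 5, 4]]
print(flip(grid, 0))   # [[4, 5, 6], [1, 2, 3]]
print(flip(grid, -1))  # [[6, 5, 4], [3, 2, 1]]
```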

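Connecting from the phone

This is how the video stream gets into the program. You could equally pass a network video URL or a local video file path, which turns the script into a video player. Either way, the next step is to read frames from the source we opened.

An example of a network video URL: https://klxxcdn.oss-cn-/histudy/hrm/media/bg3.mp4

Since the stream is live and the program has to keep running, a while True loop is the natural choice.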

import cv2  # import the library

cv2.namedWindow("camera", 1)  # create the display window
# Replace this with the LAN address generated by the IP camera app.
video = "http://admin:admin@192.168.0.101:8081/"
# Open the video source; video can also be the path of a local file, which makes this a player.
capture = cv2.VideoCapture(video)
num = 0
while True:
    success, img = capture.read()  # read one frame
    if not success:
        break
    img = cv2.flip(img, 1)
    cv2.imshow("camera", img)
    key = cv2.waitKey(10)
    if key == 27:  # Esc quits
        break
    if key == ord(' '):  # the space bar saves the current frame
        num = num + 1
        filename = "frames_%s.jpg" % num
        cv2.imwrite(filename, img)  # save one image
capture.release()
# The method is automatically called by subsequent VideoCapture::open and by VideoCapture destructor.
cv2.destroyWindow("camera")
# The function destroyWindow destroys the window with the given name.

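While the program runs, pressing the space bar calls cv2.imwrite(filename, img), which saves the current frame as an image; filename carries the path and name of the output file.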

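A problem when processing images with Python on Windows: !_src.empty() in function 'cv::cvtColor'

This error appears at run time. The message indicates that cvtColor did not receive a source image, so we checked step by step:

The file path was correct and absolute, but the file name contained Chinese characters, and that turned out to be the cause: after saving copies of the processed file, the copy with an English file name loaded normally, while the copy with a Chinese file name failed.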
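
Renaming the file is one fix. A commonly used alternative, which is not part of the original article, is to read the bytes with NumPy and decode them in memory with cv2.imdecode, so that OpenCV never has to open the non-ASCII path itself. The sketch below assumes numpy and opencv-python are installed; the function names are hypothetical.

```python
def needs_unicode_workaround(path):
    """True if the path contains non-ASCII characters (for example Chinese)."""
    return not path.isascii()


def imread_unicode(path):
    """Read an image whose path may contain non-ASCII characters.

    Assumption: numpy and opencv-python are installed. They are imported
    here so that the helper above works even without them.
    """
    import cv2
    import numpy as np

    # np.fromfile reads the raw bytes through Python's own file handling,
    # and cv2.imdecode then decodes them in memory.
    data = np.fromfile(path, dtype=np.uint8)
    return cv2.imdecode(data, cv2.IMREAD_COLOR)


print(needs_unicode_workaround("image/test.jpeg"))    # False
print(needs_unicode_workaround("image/唤醒手腕.jpg"))  # True
```

Like cv2.imread, cv2.imdecode returns None when the data cannot be decoded, so the result should still be checked before it is passed to cvtColor.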

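02. Face recognition with OpenCV

Before any recognition step can run, the face has to be found accurately in the frame, which requires a classifier that reliably separates faces from everything else. Here we can use a classifier that has already been trained: a wide variety is available online, their accuracy is fairly high, and using one saves the time we would otherwise spend on training.

Downloading the face detection XML file

You need the face model file haarcascade_frontalface_default.xml, against which the frames captured by the camera are compared. After downloading it, place haarcascade_frontalface_default.xml in the same directory as the code file above.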

Link: /s/1lxZrI9ZjXWreJvKPYgyQRQ  Extraction code: w96c

The code for detecting face positions is shown below:

import cv2

# Open the video stream. In the URL, the part before @ is the account and password; the part after it is the IP address.
cap = cv2.VideoCapture("http://admin:admin@192.168.0.101:8081/")
face_xml = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")  # load the XML file
while cap.isOpened():
    f, img = cap.read()  # read one frame
    if not f:
        break
    img = cv2.flip(img, 1)  # mirror the frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale
    face = face_xml.detectMultiScale(gray, 1.3, 10)  # detect faces and return their positions
    for (x, y, w, h) in face:
        # (x, y) is the top-left corner; w and h are the width and height of the box.
        # (255, 255, 255) is the color of the rectangle drawn around the face.
        # The final 2 is the line thickness, similar to the border of a div in CSS.
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 255, 255), 2)
    cv2.imshow("camera", img)
    if cv2.waitKey(10) == 27:
        break
cap.release()

Result: a white rectangle appears around each detected face.
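
Each detection is an (x, y, w, h) box, and a little arithmetic on those boxes is often all that later steps need. The sketch below converts a box to the two corner points that cv2.rectangle expects and picks the largest detection. The helper names and the sample boxes are made up for illustration.

```python
def box_corners(x, y, w, h):
    """Convert an (x, y, w, h) box into its top-left and bottom-right corners."""
    return (x, y), (x + w, y + h)


def largest_face(faces):
    """Return the box with the largest area, or None if nothing was detected."""
    if len(faces) == 0:
        return None
    return max(faces, key=lambda box: box[2] * box[3])


# Made-up detections for illustration, in the (x, y, w, h) form.
faces = [(40, 60, 80, 80), (300, 120, 150, 150)]
print(largest_face(faces))                # (300, 120, 150, 150)
print(box_corners(*largest_face(faces)))  # ((300, 120), (450, 270))
```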

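03. Image processing with Pillow

An introduction to the Python Pillow (PIL) library: Pillow is a third-party Python library.

In Python 2, PIL (Python Imaging Library) was a very handy image processing library, but PIL does not support Python 3, so Alex Clark and Contributors created Pillow, which works with Python 3.

Official documentation: https://pillow.readthedocs.io/en/latest/

Using the Python Pillow (PIL) library

After Pillow is installed, the package is imported under the name PIL, not pillow or Pillow: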

import PIL
from PIL import Image

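Besides its twenty-odd modules, Pillow supports a large number of plugins. The most commonly used piece is the Image class in the module of the same name. Many other modules build on the Image module to apply further, more specialized processing to images, and the Image module itself imports some of them. This article covers the commonly used methods of the Image module.

Common functions:

open(fp, mode='r'): opens an image; it is a function of the Image module. If the image is in the same directory as the code, the file name alone is enough; otherwise the path to the image must be included. mode defaults to 'r' and must be 'r'.

show(): opens the image in an external image viewer. Depending on the platform and viewer, the program may block until the viewer is closed manually.

Creating a new image: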

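from PIL import Image

image = Image.new('RGB', (250, 250), (0, 0, 255))
image.show()
"""
new(mode, size, color=0): creates an image (a canvas) to draw on; it is a function of the Image module and takes 3 parameters.
mode: the image mode, such as "RGB" (red, green, blue; a true-color image) or "L" (grayscale).
size: the image size, a 2-tuple (width, height) in pixels.
color: the fill color; the default 0 means black. It can be a 3-tuple or a hexadecimal color string, and since
version 1.1.4 also an English color name, so (0, 0, 255) above could be written as '#0000FF' or 'blue'; all mean blue.
"""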

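Common attributes of an Image

The width attribute is the image's width in pixels and height is its height in pixels; together they form the size attribute, a tuple (width, height).

The mode attribute is the image mode, such as RGBA, RGB, P, or L.

The format attribute is the file format of the image, which generally corresponds to the file extension. The category attribute is the category of the image.

The readonly attribute indicates whether the image is read-only; its value is 1 or 0, standing for a boolean.

The info attribute holds information about the image, as a dictionary.

Cropping and resizing images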

from PIL import Image

image = Image.open("唤醒手腕.jpg")
image_crop = image.crop(box=(300, 300, 800, 700))
# image_crop.show()
print('before resize: ', image.size)
image_resize = image.resize((500, 400), resample=Image.LANCZOS, box=(100, 100, 1200, 800), reducing_gap=5.0)
print('after resize: ', image_resize.size)
image_resize.show()

The parameters used in the code above:

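crop(box=None): crops the image and returns the cropped region as a new image. box is the region to crop, given as a 4-tuple (x0, y0, x1, y1). If it is omitted, the whole image is copied, which is equivalent to copy(). If the region extends beyond the original image, the part outside is filled with blank pixels.

resize(size, resample=BICUBIC, box=None, reducing_gap=None): resizes the image and returns the resized copy. It takes 4 parameters.

size: the size after scaling, a 2-tuple (width, height).

resample: an optional resampling filter. Accepted values are Image.NEAREST, Image.BOX, Image.BILINEAR, Image.HAMMING, Image.BICUBIC, and Image.LANCZOS; the default is Image.BICUBIC. If the image mode is '1' or 'P', Image.NEAREST is always used.

box: the region of the source image to scale, a 4-tuple (x0, y0, x1, y1). It must lie within the (0, 0, width, height) rectangle of the original image, otherwise an error is raised. If it is omitted, the whole image is scaled.

reducing_gap: a float that optimizes resizing by reducing the image in two steps. By default no such optimization is applied; with values of 3.0 or above the result is practically indistinguishable from fair resampling.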
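
The box arithmetic behind crop and resize can be checked before touching any image. A minimal sketch follows, with hypothetical helper names and made-up image sizes: crop_size gives the size of the cropped result, and box_inside tests the constraint that resize places on its box argument.

```python
def crop_size(box):
    """Size (width, height) of the region described by (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return x1 - x0, y1 - y0


def box_inside(box, image_size):
    """True if the box lies within (0, 0, width, height) of the image."""
    x0, y0, x1, y1 = box
    width, height = image_size
    return 0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height


print(crop_size((300, 300, 800, 700)))                  # (500, 400)
print(box_inside((100, 100, 1200, 800), (1920, 1080)))  # True
print(box_inside((100, 100, 1200, 800), (1000, 600)))   # False
```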

Example: resizing images in a batch

from PIL import Image

# The PIL package is provided by the pillow distribution:
# after pip install pillow, PIL can be imported.
for index in range(1, 41):
    img = Image.open(f"./wrist_tang/mini_img_{str(index).zfill(2)}.jpeg")
    # resize returns a resized copy of this image; size is a 2-tuple (width, height) in pixels.
    # Image.LANCZOS is the filter formerly available under the name Image.ANTIALIAS.
    img_deal = img.resize((300, 300), Image.LANCZOS)
    # RGBA differs from RGB by an extra alpha (transparency) channel. JPEG cannot
    # store alpha, so convert to RGB before saving as .jpeg.
    img_deal = img_deal.convert('RGB')
    try:
        img_deal.save(f"mini_tang/mini_img_{str(index).zfill(2)}.jpeg")
    except IOError:
        print("cannot save")
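
The batch script stretches every image to exactly 300 x 300, which distorts anything that is not square. If the aspect ratio should be preserved instead, the target size can be computed first. The helper below is a sketch with a hypothetical name and is not part of the original script; Pillow's own Image.thumbnail method applies the same idea in place.

```python
def fit_within(size, max_size):
    """Scale (width, height) down to fit max_size while keeping the aspect ratio."""
    width, height = size
    max_width, max_height = max_size
    scale = min(max_width / width, max_height / height, 1.0)  # never enlarge
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within((1200, 800), (300, 300)))  # (300, 200)
print(fit_within((200, 100), (300, 300)))   # (200, 100)
```

The result can be passed as the size argument of resize, for example img.resize(fit_within(img.size, (300, 300)), Image.LANCZOS).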

Example: adding text to an image to make a meme

from PIL import Image, ImageDraw, ImageFont

def addTransparency(img, factor=0.7):
    img = img.convert('RGBA')
    img_blender = Image.new('RGBA', img.size, (0, 0, 0, 0))
    img = Image.blend(img_blender, img, factor)
    return img

png = Image.open("img.png")  # returns a PIL.Image.Image object
bottom_back = Image.open('bottom_filter.png')
bottom_back = addTransparency(bottom_back, factor=0.8)
# Paste the strip at (0, 250), using its own alpha channel as the mask.
png.paste(bottom_back, (0, 250), bottom_back)
draw = ImageDraw.Draw(png)
ttfront = ImageFont.truetype('STXINGKA.TTF', 22)  # font file and font size
# xy is the text position, text the content, fill the text color, font the typeface.
draw.text(xy=(10, 262), text="我特别喜欢你，特别喜欢", fill=(15, 15, 15), font=ttfront)
png.save("emoji_picture.png")
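
The addTransparency helper works because Image.blend interpolates the two images as image1 * (1 - alpha) + image2 * alpha. Blending with a fully transparent black image therefore scales every channel, alpha included, by the factor. The per-pixel sketch below illustrates the formula only; the function name is hypothetical, and the exact rounding used inside Pillow may differ.

```python
def blend_pixel(pixel1, pixel2, alpha):
    """Interpolate two pixels channel by channel: p1 * (1 - alpha) + p2 * alpha."""
    return tuple(round(a * (1.0 - alpha) + b * alpha) for a, b in zip(pixel1, pixel2))


transparent = (0, 0, 0, 0)
opaque = (100, 200, 50, 255)
print(blend_pixel(transparent, opaque, 0.8))  # (80, 160, 40, 204)
```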

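04. Image recognition with pyautogui

Taking a screenshot

Mouse actions should not be performed blindly: we need to watch what is on the screen in order to decide whether to perform an action. For this, pyautogui provides the screenshot() function, which returns a Pillow Image object.

Three commonly used functions:

im = pyautogui.screenshot(): returns a screenshot of the screen as a Pillow Image object.

im.getpixel((500, 500)): returns the color of the pixel at (500, 500) in im, as an RGB tuple.

pyautogui.pixelMatchesColor(500, 500, (12, 120, 40)): a comparison function that checks whether the pixel at (500, 500) on the screen has the given color. Each component of the color must lie between 0 and 255.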
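
Exact color comparison is fragile, because anti-aliasing and compression shift pixel values slightly. pyautogui.pixelMatchesColor accepts an optional tolerance argument for this reason. The comparison can be sketched in plain Python as follows; the function name colors_match and the sample colors are illustrative.

```python
def colors_match(actual, expected, tolerance=0):
    """True if every RGB channel differs by at most tolerance."""
    return all(abs(a - e) <= tolerance for a, e in zip(actual[:3], expected[:3]))


print(colors_match((12, 120, 40), (12, 120, 40)))               # True
print(colors_match((14, 118, 43), (12, 120, 40)))               # False
print(colors_match((14, 118, 43), (12, 120, 40), tolerance=5))  # True
```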

import pyautogui

im = pyautogui.screenshot()
im.save('屏幕截图.png')

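Locating an image on screen

First we need a reference picture of the thing to find. For example, to click a "like" button, first save a picture of the thumbs-up icon.

Then call the function locateOnScreen('dst_im.png').

If the image is found, its position is returned, for example: Box(left=25, top=703, width=22, height=22)

If the image is not found, None is returned (PyAutoGUI 0.9.41 and later raise ImageNotFoundException instead).

If the image matches in several places on the screen, use locateAllOnScreen('dst_im.png'), which yields every match and can be converted to a list, as in the following example: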

import pyautogui

pyautogui.PAUSE = 1
# Locate a single match
btm = pyautogui.locateOnScreen('dst_im.png')
print(btm)  # Box(left=1280, top=344, width=22, height=22)
# Locate all matches
btm = pyautogui.locateAllOnScreen('dst_im.png')
print(list(btm))  # [Box(left=1280, top=344, width=22, height=22), Box(left=25, top=594, width=22, height=22)]

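pyautogui.center((left, top, width, height)) returns the center point of the given region, so combined with the mouse functions we can click the center of the image that was found.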
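
The center calculation is simple enough to sketch by hand. The version below uses a local namedtuple called Box as a stand-in for the object returned by locateOnScreen, so it runs without pyautogui installed; it illustrates the arithmetic and is not the library's source.

```python
from collections import namedtuple

Box = namedtuple("Box", "left top width height")


def center(box):
    """Center point (x, y) of a (left, top, width, height) region."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


print(center(Box(left=25, top=703, width=22, height=22)))  # (36, 714)
```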

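QQ auto-login example

The keyboard function keyDown(key): key (str) is the key to press down. Valid names are listed in KEYBOARD_KEYS.

The source code lists every key name that can be used.

To type an uppercase letter or a shifted character (@, #, $), simply pass it as the argument: the underlying source performs the conversion, which we will not analyze in detail here.

Under the hood it holds down the Shift key, which produces the keyDown effect for uppercase letters.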
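
The Shift conversion boils down to deciding which characters need the Shift key. The following is a sketch of that decision under the assumption of a US keyboard layout; the name needs_shift and the symbol set are illustrative and not copied from the pyautogui source.

```python
SHIFTED_SYMBOLS = set('~!@#$%^&*()_+{}|:"<>?')


def needs_shift(character):
    """True if typing the character requires holding Shift (US keyboard layout)."""
    return character.isupper() or character in SHIFTED_SYMBOLS


print([ch for ch in "Ab@1#" if needs_shift(ch)])  # ['A', '@', '#']
```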

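QQ login: writing and testing the code

First prepare two PNG images: the QQ icon on the desktop and the secure login button of the QQ client.

Because the QQ client takes time to start, time.sleep(2) pauses the script while waiting for the client to respond.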

import time
import pyautogui

pyautogui.PAUSE = 1
# Locate (single match) the QQ icon on the desktop
qq_logo = pyautogui.locateOnScreen('qq_logo.png')
print(qq_logo)
print(pyautogui.center(qq_logo))
pyautogui.doubleClick(pyautogui.center(qq_logo))
time.sleep(2)  # wait for the QQ client to respond
passwd = "your QQ password"
for letter in passwd:
    pyautogui.keyDown(letter)
    pyautogui.keyUp(letter)  # release the key so that it does not stay held down
# Locate (single match) the secure login button of the QQ client
safe_login_btn = pyautogui.locateOnScreen('qq_safe_login.png')
print(safe_login_btn)
print(pyautogui.center(safe_login_btn))
pyautogui.click(pyautogui.center(safe_login_btn))

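The test above covers the repeat-login case: when logging in to QQ again (with the account remembered), only the password is needed, and once the QQ client has started, the cursor is already placed in the password field.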
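
A fixed time.sleep(2) is either too short on a slow machine or wasted time on a fast one. One alternative is to poll until the button appears. The helper below is a generic sketch with a hypothetical name: it takes any callable, so in practice it would be used as wait_for(lambda: pyautogui.locateOnScreen('qq_safe_login.png')). Here a stand-in function replaces the real screen search so that the example runs anywhere.

```python
import time


def wait_for(locate, timeout=10.0, interval=0.5):
    """Call locate() repeatedly until it returns a result or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = locate()
        except Exception:      # newer PyAutoGUI versions raise when nothing is found
            result = None
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in for pyautogui.locateOnScreen: succeeds on the third attempt.
attempts = []


def fake_locate():
    attempts.append(1)
    return (25, 703, 22, 22) if len(attempts) >= 3 else None


found = wait_for(fake_locate, timeout=2.0, interval=0.01)
print(found)  # (25, 703, 22, 22)
```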

Shutting down the computer automatically: driving it through pyautogui is really only a pseudo-automatic approach:

import time
import pyautogui

# Open the Windows Start menu
pyautogui.click(x=30, y=1060)
# Click the power button in the menu
pyautogui.click(x=30, y=1000)
# Short delay
time.sleep(0.1)
# Shut down
pyautogui.click(x=30, y=900)
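
The hard-coded coordinates only make sense for one screen resolution. Assuming they were chosen for a 1920 x 1080 screen with the taskbar at the bottom (the article does not say so), they can be expressed as offsets from the bottom edge, so the same script adapts to other screen heights. The function name and offsets below are hypothetical.

```python
def shutdown_click_points(screen_height):
    """Click positions measured upward from the bottom edge of the screen.

    The offsets reproduce the hard-coded values above on a screen that is
    1080 pixels high; they are assumptions about one Start menu layout.
    """
    x = 30
    return [(x, screen_height - 20),    # Start button
            (x, screen_height - 80),    # power button
            (x, screen_height - 180)]   # shut down entry


print(shutdown_click_points(1080))  # [(30, 1060), (30, 1000), (30, 900)]
```

The screen height can be obtained at run time from pyautogui.size(), which returns the width and height of the primary screen.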

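05. Building a GUI desktop application

While using a desktop application, have you ever felt the urge to learn how such applications are built? In the industry, professional desktop applications are usually written in C++ or C#; some are written in Java, but those are less common.

Python also supports GUI (graphical user interface) programming, and attractive desktop programs can be written with it. For this lesson a passing acquaintance is enough; do not spend too much time on it, since hardly any company hires a Python programmer to build desktop applications.

Let's first look at the mainstream GUI toolkits available today.

Tkinter: built on the Tk toolkit. It supports most Unix systems and also runs on Windows and Mac. It is Python's standard GUI library, but developers often complain about its dated look and poor documentation.

wxPython: offers a mature and rich set of packages, is cross-platform (Unix, Windows, Mac), and is easy to pick up. The documentation is detailed, and the official collection of demos lowers the barrier for beginners, which makes it a good first choice for learning.

PythonWin: works only on Windows and calls the Windows GUI. It is clearly not the choice for a cross-platform application.

PyGTK: built on the GTK toolkit; widely used on Linux, and cross-platform.

PyQt: its strengths are an attractive interface, support for multiple platforms, and plentiful documentation and tutorials. However, commercial use raises licensing issues and requires a license, and the library is comparatively large.

Development and testing with the wxPython GUI toolkit

Install the third-party wxPython library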

pip3 install -U wxPython

With the GUI library installed, let's start with a minimal program:

import wx

# First create an application object
app = wx.App()
# Next create a window (Frame)
win = wx.Frame(None)
# To make the window visible, call Show()
win.Show()
# Finally run the main loop with MainLoop()
app.MainLoop()

Official documentation site: /

Every programming language and UI toolkit needs to have a Hello World example. I think it’s the law in most jurisdictions. Their intent is obviously to tell you everything you need to know in order to select the language or toolkit for your own use. So, here is wxPython’s Hello World:

# first import wxPython package
import wx

# secondly, create an Application object
app = wx.App()
""" Construct a ``wx.App`` object. """
# create a frame; the 'title' parameter is the title of the window
frm = wx.Frame(None, title="hello world")
"""
def __init__(self, parent=None, id=None, title=None, pos=None, size=None, style=None, name=None): pass
# real signature unknown; restored from __doc__ with multiple overloads
"""
frm.Show()  # show it
# Start the event loop.
app.MainLoop()

Five lines of code to create and show a window, and run an event handler. That’s really all it takes.

What, you think 5 lines is too many? Okay, fine. Here it is in one line 😛 :

import wx; a=wx.App(); wx.Frame(None, title="Hello World").Show(); a.MainLoop()

Okay, now let’s put a little more flesh on the bones of that Hello World sample to give a little better idea of what creating a wxPython application is all about. The finished application looks like these screenshots when run:

And here is the source code. The docstrings and the comments in the code will help you understand what it is doing.

#!/usr/bin/env python
"""
Hello World, but with more meat.
"""
import wx

class HelloFrame(wx.Frame):
    """
    A Frame that says Hello World
    """

    def __init__(self, *args, **kw):
        # ensure the parent's __init__ is called
        super(HelloFrame, self).__init__(*args, **kw)
        # create a panel in the frame
        pnl = wx.Panel(self)
        # put some text with a larger bold font on it
        st = wx.StaticText(pnl, label="Hello World!")
        font = st.GetFont()
        font.PointSize += 10
        font = font.Bold()
        st.SetFont(font)
        # and create a sizer to manage the layout of child widgets
        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(st, wx.SizerFlags().Border(wx.TOP|wx.LEFT, 25))
        pnl.SetSizer(sizer)
        # create a menu bar
        self.makeMenuBar()
        # and a status bar
        self.CreateStatusBar()
        self.SetStatusText("Welcome to wxPython!")

    def makeMenuBar(self):
        """
        A menu bar is composed of menus, which are composed of menu items.
        This method builds a set of menus and binds handlers to be called
        when the menu item is selected.
        """
        # Make a file menu with Hello and Exit items
        fileMenu = wx.Menu()
        # The "\t..." syntax defines an accelerator key that also triggers
        # the same event
        helloItem = fileMenu.Append(-1, "&Hello...\tCtrl-H",
                "Help string shown in status bar for this menu item")
        fileMenu.AppendSeparator()
        # When using a stock ID we don't need to specify the menu item's
        # label
        exitItem = fileMenu.Append(wx.ID_EXIT)
        # Now a help menu for the about item
        helpMenu = wx.Menu()
        aboutItem = helpMenu.Append(wx.ID_ABOUT)
        # Make the menu bar and add the two menus to it. The '&' defines
        # that the next letter is the "mnemonic" for the menu item. On the
        # platforms that support it those letters are underlined and can be
        # triggered from the keyboard.
        menuBar = wx.MenuBar()
        menuBar.Append(fileMenu, "&File")
        menuBar.Append(helpMenu, "&Help")
        # Give the menu bar to the frame
        self.SetMenuBar(menuBar)
        # Finally, associate a handler function with the EVT_MENU event for
        # each of the menu items. That means that when that menu item is
        # activated then the associated handler function will be called.
        self.Bind(wx.EVT_MENU, self.OnHello, helloItem)
        self.Bind(wx.EVT_MENU, self.OnExit, exitItem)
        self.Bind(wx.EVT_MENU, self.OnAbout, aboutItem)

    def OnExit(self, event):
        """Close the frame, terminating the application."""
        self.Close(True)

    def OnHello(self, event):
        """Say hello to the user."""
        wx.MessageBox("Hello again from wxPython")

    def OnAbout(self, event):
        """Display an About Dialog"""
        wx.MessageBox("This is a wxPython Hello World sample",
                      "About Hello World 2",
                      wx.OK|wx.ICON_INFORMATION)

if __name__ == '__main__':
    # When this module is run (not imported) then create the app, the
    # frame, show it, and start the event loop.
    app = wx.App()
    frm = HelloFrame(None, title='Hello World 2')
    frm.Show()
    app.MainLoop()

A note on what the super().__init__() call does:

class Person(object):
    def __init__(self, name, gender, age):
        self.name = name
        self.gender = gender
        self.age = age

class Student(Person):
    def __init__(self, name, gender, age, school, score):
        # Let the parent class initialize name, gender, and age.
        super(Student, self).__init__(name, gender, age)
        self.school = school
        self.score = score

s = Student('Alice', 'female', 18, 'Middle school', 87)
print(s.school)  # Middle school
print(s.name)    # Alice
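
To see why the call matters, compare a subclass that calls the parent initializer with one that forgets to. This sketch uses the shorter Python 3 form super().__init__(...), which is equivalent to super(Student, self).__init__(...) inside the class. The Forgetful class is a made-up counterexample.

```python
class Person:
    def __init__(self, name, gender, age):
        self.name = name
        self.gender = gender
        self.age = age


class Student(Person):
    def __init__(self, name, gender, age, school, score):
        # Without this call, name, gender and age would never be set on the instance.
        super().__init__(name, gender, age)
        self.school = school
        self.score = score


class Forgetful(Person):
    def __init__(self, name, school):
        self.school = school  # the parent __init__ is never called


s = Student('Alice', 'female', 18, 'Middle school', 87)
print(s.name, s.score)                                   # Alice 87
print(hasattr(Forgetful('Bob', 'High school'), 'name'))  # False
```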