To address the difficulty of extracting stable features of small groups of people under illumination changes in cross-view scenes, an illumination-invariant optical flow preprocessing algorithm based on a weighted regularization transformation was proposed; it yields illumination-invariant optical flow estimates in cross-view scenes with changing light. To address the lack of large-scale datasets for small-group research in visible-light scenes, as well as the differences in group features caused by changes in field of view, a single-modality visible-light dataset, SYSU-Group, was constructed through extensive recruitment of volunteers, and a SVIGR algorithm based on a Siamese (twin) network was proposed. To solve the temporal synchronization problem of multi-view small-group videos, a multi-view style transfer method based on self-supervised learning was proposed, which supervises the learning of target keypoints through latent correspondences between views. The relevant results are as follows:
Low-light images suffer from low visibility and high noise, and enhancing them is an ill-posed problem, mainly because many different enhanced outputs are plausible for the same input. Low-light image processing has long been a hot topic in academia and industry, both for providing better visualization for humans and for revealing details for machine vision applications. To this end, the team nonlinearly maps the denoised low-light reconstructed image and its illumination component into a high-quality enhanced image through BTF, according to the ideal exposure state. The research results apply to a wide range of scenarios, such as machine vision, object tracking, pedestrian re-identification, and HDR reconstruction.
Single-image dehazing faces the problems of noise amplification and loss of detail. To this end, the team decomposes the image into different scales and uses the Gaussian-filtered image to calculate the atmospheric light component, which avoids the influence of noise. For the transmittance, an initial value is first calculated with the dark channel prior, the initial transmittance is then refined using the non-local haze-line prior, and WFIG is finally used to further suppress noise in the transmittance map. The image is then restored with a multi-scale strategy, which preserves the details of the dehazed image. Extensive experiments show that the proposed algorithm has clear advantages in noise suppression, detail enhancement, and color fidelity.
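The initial transmittance step can be illustrated with the textbook dark channel prior. The following is a minimal sketch in plain Python, not the team's implementation: the function names, the `omega` and `t_min` parameters, and the assumption that the atmospheric light is already known are all illustrative, and the haze-line refinement, WFIG filtering, and multi-scale restoration are omitted.

```python
def dark_channel(image, radius=1):
    """Minimum over RGB at each pixel, then minimum over a square patch."""
    h, w = len(image), len(image[0])
    min_rgb = [[min(px) for px in row] for row in image]
    dark = []
    for y in range(h):
        ys = range(max(0, y - radius), min(h, y + radius + 1))
        row = []
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            row.append(min(min_rgb[j][i] for j in ys for i in xs))
        dark.append(row)
    return dark


def initial_transmission(image, airlight, omega=0.95, radius=1):
    """Initial estimate t(x) = 1 - omega * dark_channel(I / A)."""
    normalized = [[tuple(c / a for c, a in zip(px, airlight)) for px in row]
                  for row in image]
    return [[1.0 - omega * d for d in row]
            for row in dark_channel(normalized, radius)]


def recover(image, transmission, airlight, t_min=0.1):
    """Invert the haze model I = J * t + A * (1 - t).

    The floor t_min (an assumed value) limits noise amplification
    where the estimated transmittance is very small.
    """
    return [[tuple((c - a) / max(t, t_min) + a for c, a in zip(px, airlight))
             for px, t in zip(row, t_row)]
            for row, t_row in zip(image, transmission)]
```

Here `image` is a nested list of `(r, g, b)` tuples in the range 0 to 1 and `airlight` is an `(r, g, b)` tuple. Dividing by a small transmittance is exactly where noise gets amplified, which is why a refinement and noise-suppression stage on the transmittance map matters.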
Rain removal algorithms based on image decomposition tend to lose image details, which directly restricts high-level vision tasks. For this reason, the team introduced the idea of denoising into the rain removal task, based on the similarity between the two tasks. Exploiting the local similarity among blocks within an image, the network structure is designed around the Non-Local idea: similar blocks are extracted and fused to remove the rain. Extensive experiments show that the proposed algorithm retains image details while removing the rain.
This research direction uses image modeling, image mosaicking, image fusion, image registration, and other means to study methods for high-dynamic-range video. The main research contents are: 1) Camera response function: the recorded image intensity and the irradiance of the scene have a nonlinear relationship. 2) Moving target detection: process a large amount of video data in real time; analyze, locate, and segment the targets of interest; track the detected moving targets; and analyze and identify target behavior through tracking. 3) Image registration: match and superimpose multiple images acquired at different times, by different imaging devices, or under different conditions. 4) Tone mapping: apply a large contrast attenuation to bring the scene brightness into a displayable range while maintaining image details and colors, so that the resulting high-dynamic-range images can be shown on an ordinary display. 5) Image synthesis: combine the high-dynamic-range background image and the high-dynamic-range target image into the result image that best meets the requirements.
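As an illustration of item 4, the sketch below applies the global Reinhard operator, a widely used textbook tone-mapping curve. It is chosen here only as an example and is not stated to be the method used in this research; the `key` and `eps` parameters are assumed defaults.

```python
import math


def reinhard_tone_map(luminance, key=0.18, eps=1e-6):
    """Compress non-negative HDR luminances into [0, 1).

    Luminances are scaled so that their log-average maps to `key`,
    then compressed with the curve L / (1 + L).
    """
    flat = [v for row in luminance for v in row]
    log_avg = math.exp(sum(math.log(eps + v) for v in flat) / len(flat))
    out = []
    for row in luminance:
        scaled = [key * v / log_avg for v in row]
        out.append([s / (1.0 + s) for s in scaled])
    return out
```

The curve is monotonic, so brightness ordering is preserved while values spanning many orders of magnitude are squeezed into a displayable range.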
Separating the foreground and background of a video has always been difficult because of changes in ambient lighting (both sudden and gradual), multimodal backgrounds (subtle background movements affect foreground target detection), shadows cast by moving objects, noise, and new stationary objects entering the background (which requires rapid adaptation to background changes). To this end, exploiting the low-rank property of the video background, the team proposed a low-rank decomposition algorithm based on rank estimation to effectively separate the video foreground and background. The research results have a wide range of applications, such as target recognition, target tracking, and pedestrian re-identification.
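The low-rank idea can be sketched in a deliberately simplified form: if each frame is flattened into one row of a matrix, a static background under varying illumination gain is close to rank 1, and pixels that deviate strongly from the rank-1 fit are foreground candidates. The code below fixes the rank at 1 and uses a plain least-squares fit by power iteration, so it is neither robust to large foreground regions nor an implementation of the team's rank-estimation algorithm; all names and the threshold are assumptions.

```python
def rank1_background(frames, iterations=50):
    """Best rank-1 approximation of a frames-by-pixels matrix.

    frames: list of frames, each a flat list of pixel values
    (assumed not all zero). Uses alternating power iteration.
    """
    n, m = len(frames), len(frames[0])
    v = [1.0] * m  # running estimate of the background image
    for _ in range(iterations):
        # u = M v, normalized: per-frame gain of the background
        u = [sum(f[j] * v[j] for j in range(m)) for f in frames]
        norm_u = sum(x * x for x in u) ** 0.5
        u = [x / norm_u for x in u]
        # v = M^T u: background image scaled by the top singular value
        v = [sum(frames[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * v[j] for j in range(m)] for i in range(n)]


def foreground_masks(frames, threshold, iterations=50):
    """Mark pixels whose residual against the rank-1 fit exceeds threshold."""
    background = rank1_background(frames, iterations)
    return [[abs(p - b) > threshold for p, b in zip(frame, bg)]
            for frame, bg in zip(frames, background)]
```

Because the fit is least-squares, a bright foreground object also pulls the background estimate toward itself, which is one reason robust low-rank decomposition with an appropriate rank is preferred in practice.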
Based on the theory of fuzzy neural networks, this research direction addresses how to quickly and automatically construct an effective fuzzy neural network without comprehensive knowledge of fuzzy theory, neural networks, or the application object. The main research contents include: 1) The structure of the dynamic fuzzy neural network. 2) Simultaneous determination of structure and parameters. 3) The learning method of the dynamic fuzzy neural network: rule generation criteria, the hierarchical learning idea, premise parameter assignment, consequent parameter determination, pruning techniques, and structure identification and partitioning of the input space. 4) Implementation of different algorithms for the dynamic fuzzy neural network: the singular value decomposition method, eigenvalue decomposition method, column-pivoting method, and total least squares method for pruning, and the extended Kalman filter method for parameter adjustment.
This research direction is based on the theory of pattern recognition and uses artificial intelligence, image enhancement, deep learning, and other means to study recognition methods that meet special performance requirements. The main research contents are: 1) Recognition of face, gender, age, etc.: face images captured under different lighting conditions are processed to eliminate the influence of illumination and enhance the local texture information of the face, improving recognition accuracy in face recognition applications. Gender is judged from the input face image; the FLD method is improved so that the ratio of between-class distance to within-class distance tends to infinity, yielding the most effective gender discrimination features with no overlap between classes. The age range of the input face image is also determined. 2) Character recognition: according to the characteristics of the characters, design and study suitable image preprocessing algorithms, study normalization and thinning methods for characters, and establish standard character feature databases.
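The FLD criterion mentioned in item 1 can be made concrete for features that have already been projected onto a single direction. The sketch below computes the standard two-class Fisher ratio; it illustrates the criterion only and does not reproduce the improved FLD method, and the function name is an assumption.

```python
def fisher_ratio(class_a, class_b):
    """Fisher criterion for two classes of 1-D projected features.

    Returns the squared distance between class means divided by the
    total within-class scatter. When the within-class scatter is zero
    and the means differ, the ratio is infinite.
    """
    mean_a = sum(class_a) / len(class_a)
    mean_b = sum(class_b) / len(class_b)
    scatter = (sum((x - mean_a) ** 2 for x in class_a)
               + sum((x - mean_b) ** 2 for x in class_b))
    between = (mean_a - mean_b) ** 2
    if scatter == 0.0:
        return float("inf") if between > 0 else 0.0
    return between / scatter
```

For example, `fisher_ratio([0, 2], [4, 6])` gives 4.0, while two classes that each collapse to a single distinct value give an infinite ratio, which corresponds to classes that do not overlap at all along that direction.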
Eye tracking is an important means of capturing visual attention and then interpreting intentions. To address the limitations of current EMG and EEG signals, this project designs a head-mounted eye tracking system: (1) a robust pupil detection algorithm is proposed; (2) the hardware is designed and implemented; (3) the corresponding software, including the necessary software functions, is developed using Qt; (4) the algorithms required by the system are completed, such as personal calibration, mapping from gaze direction to gaze point, reflection point detection, and target reconstruction. This research provides a theoretical and scientific basis for human-computer interaction driven by human vision and enriches the scope of natural human-computer interaction, which has important academic significance and application value.
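One simple geometric way to go from gaze direction to gaze point is to intersect the gaze ray with a known plane, such as a screen. The sketch below assumes that the eye position, gaze direction, and plane are all expressed in the same 3D coordinate frame; it is an illustration of the idea, not the system's actual calibration or mapping algorithm, and all names are hypothetical.

```python
def gaze_point_on_plane(eye, direction, plane_point, plane_normal):
    """Intersect the ray eye + t * direction (t >= 0) with a plane.

    Returns the 3-D intersection point, or None if the gaze is
    parallel to the plane or the plane lies behind the eye.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(direction, plane_normal)
    if abs(denom) < 1e-12:
        return None  # gaze ray is parallel to the plane
    offset = tuple(p - e for p, e in zip(plane_point, eye))
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None  # plane is behind the eye
    return tuple(e + t * d for e, d in zip(eye, direction))
```

With the eye at the origin looking along `(0.1, 0.2, 1.0)` and a plane at `z = 2`, the returned gaze point is approximately `(0.2, 0.4, 2.0)`.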
With the rapid development of the domestic economy, great changes have taken place in the logistics industry. Traditional manual sorting can no longer meet market demand, and the sorting business is moving toward automatic sorting. Existing intelligent sorting systems suffer from low accuracy, long processing times, and poor applicability, and cannot cope with increasingly complex industrial scenarios. To this end, the team optimized the algorithms for hand-eye calibration, pose estimation, path planning, and other stages of the intelligent sorting system, and achieved high-precision, fast intelligent sorting in industrial bin-picking scenes, providing a new technical solution for the design of intelligent sorting systems at home and abroad. The research results have broad application prospects in logistics sorting, automobile assembly, and other fields.
SLAM (simultaneous localization and mapping) builds a map of the surrounding environment and estimates the camera's own motion while the camera moves. SLAM algorithms have broad application prospects in autonomous driving, service robots, surveying and mapping, AR, and other scenarios. Traditional visual SLAM algorithms track under the assumption that the environment is static, which makes them less robust in dynamic environments, and they do not make full use of image information. To this end, the team combined semantic segmentation with SLAM to build a semantic map of the environment while improving the robustness of the algorithm, helping the robot better understand its surroundings.
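One common way to use semantic segmentation against the static-world assumption is to discard feature points that fall on potentially moving objects before tracking. The sketch below shows only that filtering step; the label names, the `DYNAMIC_CLASSES` set, and the function name are assumptions, and this is not claimed to be how the team's system combines the two techniques.

```python
# Assumed set of labels treated as potentially moving objects.
DYNAMIC_CLASSES = {"person", "car"}


def filter_static_keypoints(keypoints, label_map, dynamic_classes=DYNAMIC_CLASSES):
    """Keep keypoints (x, y) whose pixel label is not a dynamic class.

    label_map[y][x] holds the semantic label predicted for that pixel.
    """
    return [(x, y) for (x, y) in keypoints
            if label_map[y][x] not in dynamic_classes]
```

The surviving keypoints would then feed the usual pose estimation, while the per-pixel labels can also be attached to map points to form a semantic map.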
With today's rapid development of science and technology, three-dimensional information has gradually become a means for people to understand and transform the world. As a 3D reconstruction method with a wide range of applications and deep theoretical foundations, structured light has always been a research hotspot. Structured light systems pursue high precision, high efficiency, and low cost, so the team proposed efficient and robust improved algorithms for the calibration method, pattern coding, and phase unwrapping used in structured light measurement. For structured light reconstruction in special environments, the team proposed solutions for highly reflective parts and specularly reflective objects, which is of great significance to industrial applications in factories. The research results have strong prospects in industrial inspection and consumer entertainment.
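To show where phase unwrapping enters, the sketch below uses standard N-step phase shifting as an assumed pattern coding: the wrapped phase is computed per pixel from N intensity samples, then unwrapped along a scanline with the classic Itoh method. These are textbook procedures, not the team's improved algorithms, and the function names are illustrative.

```python
import math


def wrapped_phase(intensities):
    """Wrapped phase from N >= 3 equally shifted fringe samples.

    Assumes I_k = A + B * cos(phi + 2*pi*k/N); returns phi in (-pi, pi].
    """
    n = len(intensities)
    s = sum(i * math.sin(2 * math.pi * k / n) for k, i in enumerate(intensities))
    c = sum(i * math.cos(2 * math.pi * k / n) for k, i in enumerate(intensities))
    return math.atan2(-s, c)


def unwrap_1d(wrapped):
    """Itoh unwrapping along a scanline.

    Valid when the true phase changes by less than pi between
    neighbouring samples.
    """
    out = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        delta = cur - prev
        delta -= 2 * math.pi * round(delta / (2 * math.pi))  # remove 2*pi jumps
        out.append(out[-1] + delta)
    return out
```

Itoh unwrapping fails at depth discontinuities and in noisy regions, where neighbouring phase differences can exceed pi, which is one reason more robust unwrapping strategies are studied.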
This research direction uses image processing and pattern recognition theories and methods, combining traditional methods with deep learning, to study character recognition in natural scenes and to develop embedded character recognition systems based on Internet of Things technology. It has broad application prospects in detonator code recognition, instrument reading recognition and remote meter reading, license plate recognition, container number recognition, natural scene text recognition, and similar tasks.
This research direction uses Internet of Things and machine vision technology to collect image or video data and intelligently detects specific targets and behaviors with deep learning methods; the detectors can be deployed in the cloud or on terminal devices to meet the needs of various application scenarios. Typical application scenarios: (1) all-weather pedestrian detection, face detection, vehicle detection, and smoke and fire detection in monitored areas; (2) detection of smoking and mobile phone use at gas stations, laboratories, and factory sites; (3) detection of unsafe driving behaviors, such as fatigue (closed eyes, yawning, dozing off, etc.), making phone calls, and inattention.
With the advent of the Industry 5.0 era, personalized customization has come to the fore, and monochrome 3D printing can hardly meet users' diverse and personalized needs. The SLA full-color 3D printers currently on the market are bulky, expensive, and complicated to operate, and cannot satisfy the growing number of individual users. To this end, the team solved challenging problems such as uneven mixing of consumables, narrow color coverage, and low color accuracy by designing a new type of mixing nozzle, developing a color mixing scheme for consumables, and improving the 3D printing control system. This provides a new technical solution for the design and manufacture of full-color 3D printers and plays an important role in promoting the development of full-color 3D printing technology. The research results have broad prospects and great potential in the fields of cultural creativity, medical and health care, cultural heritage restoration, and remote sensing mapping.
To reduce the cost and footprint of the entire machine vision system, the team integrated the image acquisition unit, image processing unit, and analysis unit into a single device based on an embedded platform, and developed a complete smart vision system covering the hardware system, software system, and user interface. The system involves the control and deployment of the light source system, the overall construction and integration of the vision system, and the research and design of the system's functional algorithms. Future functional expansion of the vision system is also taken into account, so that system functions can be customized for different application scenarios. Embedded vision systems offer good portability, high cost performance, low power consumption, high reliability, and easy programming, and have become the main direction of vision system development.
There has been relatively little domestic research on intelligent optimization and prediction systems for blasting, and existing work suffers from outdated technology and low efficiency. A good solution that applies artificial intelligence and computer-aided design to blasting engineering is urgently needed to meet production requirements for high-quality blasting results. This project takes bench (step) loosening blasting in open-pit mines as its research object and develops an artificial-intelligence-based blasting optimization and prediction software system: in addition to AI-based prediction and optimization, it embeds the empirical formulas commonly used in the industry, so that the resulting international commercial software system meets the needs of various usage scenarios. The system meets the main performance indicators of commercial software, such as stability, compatibility, and safety; its blasting optimization and prediction results meet the design requirements of on-site blasting; and it has achieved the goal of formal use in China and formal trials abroad.