Brazilian Sign Language Recognition

Sign language is the primary means of communication among deaf people and between deaf and hearing people, characterised by its visual-gestural modality. Like oral languages, sign languages are unique to each culture, with their own grammatical structures. Their structural unit, called a "sign", comprises hand configuration, location, movement, palm orientation, and face and torso movements, which constitute the manual and non-manual parameters.

Sign language recognition is considered one of the most important and challenging applications of gesture recognition, involving the fields of pattern recognition, machine learning and computer vision.

One challenge in Brazilian Sign Language (Libras) recognition is the absence of a robust dataset for validating different methodologies. To address this, our team developed the datasets below, which we are using with different machine learning techniques to support communication with deaf people and to enable human-computer interaction based on visual signs.

First dataset: Brazilian Sign Language (Libras) dataset with 34 signs for sign language and gesture recognition benchmarking: (1) person, (2) to spread, (3) to copy, (4) to catch, (5) to gather, (6) to disappear, (7) to look, (8) fair, (9) truth, (10) weight, (11) justice, (12) who, (13) nothing, (14) to believe, (15) to forget, (16) to love, (17) to afflict, (18) to commemorate, (19) rancor, (20) assembly meeting, (21) to compare, (22) to scream, (23) to speak, (24) to absorb, (25) to fatten, (26) to quarrel, (27) perspicacious, (28) to shine, (29) maid, (30) to replace, (31) prison, (32) television, (33) yesterday and (34) future. Each of these signs was recorded 5 times by one signaller, totalling a database of 170 samples. The signs were captured using an RGB-D sensor (Microsoft Kinect) and processed by the nuiCaptureAnalyze software. This dataset is publicly available at Zenodo (Almeida, 2014). Some scientific works related to this dataset are Almeida et al. (2013), Almeida et al. (2014a) and Almeida (2014b).
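The composition of this dataset (34 signs, each recorded 5 times by one signaller, giving 170 samples) can be sketched as a simple label index. The sign names come from the list above; any mapping to actual file names on Zenodo is left out, since the naming scheme is not described here:

```python
# Enumerate the 170 (sign, repetition) pairs of the Libras-34 dataset.
SIGNS = [
    "person", "to spread", "to copy", "to catch", "to gather",
    "to disappear", "to look", "fair", "truth", "weight", "justice",
    "who", "nothing", "to believe", "to forget", "to love", "to afflict",
    "to commemorate", "rancor", "assembly meeting", "to compare",
    "to scream", "to speak", "to absorb", "to fatten", "to quarrel",
    "perspicacious", "to shine", "maid", "to replace", "prison",
    "television", "yesterday", "future",
]

REPETITIONS = 5  # each sign was recorded 5 times by one signaller

# One (sign, repetition) pair per sample: 34 signs x 5 repetitions = 170.
samples = [(sign, rep) for sign in SIGNS for rep in range(1, REPETITIONS + 1)]
print(len(samples))  # 170
```

The same pattern applies to the other datasets by swapping the sign list and repetition count.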

Second dataset: Brazilian Sign Language (Libras) dataset with 10 signs for sign language and gesture recognition benchmarking: (1) to calm down, (2) to accuse, (3) to annihilate, (4) to love, (5) to gain weight, (6) happiness, (7) slim, (8) lucky, (9) surprise and (10) angry. Each of these signs was recorded 10 times by one signaller, totalling a database of 100 samples. The signs were captured using an RGB-D sensor (Microsoft Kinect) and processed by the nuiCaptureAnalyze software. This dataset is publicly available at Zenodo (Almeida et al., 2019a). Some scientific works related to this dataset are Rezende et al. (2016a), Rezende et al. (2016b), Rezende (2016c), Rezende et al. (2017) and Guerra et al. (2018).

Third dataset: Brazilian Sign Language (Libras) dataset with 20 signs for sign language and gesture recognition benchmarking: (1) to happen, (2) student, (3) yellow, (4) America, (5) to enjoy, (6) candy, (7) bank, (8) bathroom, (9) noise, (10) five, (11) to know, (12) mirror, (13) corner, (14) son, (15) apple, (16) fear, (17) bad, (18) frog, (19) vaccine and (20) will. Each sign was recorded 5 times by 12 signers against a chroma-key background. The signers include men and women with basic to advanced knowledge of Libras. The RGB-D sensor (Kinect v2) provides the RGB videos (1920 x 1080) and depth videos (640 x 480) in ".mp4" format, and the body points and face data in ".txt" files. The body file records seven fields (Position X, Y and Z; Orientation X, Y and Z; TrackingState; LeftHandState; RightHandState; ColorPosition X and Y; and DepthPosition X and Y) for each of the 25 body points: (1) Spine Base, (2) Spine Mid, (3) Neck, (4) Head, (5) Shoulder Left, (6) Elbow Left, (7) Wrist Left, (8) Hand Left, (9) Shoulder Right, (10) Elbow Right, (11) Wrist Right, (12) Hand Right, (13) Hip Left, (14) Knee Left, (15) Ankle Left, (16) Foot Left, (17) Hip Right, (18) Knee Right, (19) Ankle Right, (20) Foot Right, (21) Spine Shoulder, (22) Hand Tip Left, (23) Thumb Left, (24) Hand Tip Right and (25) Thumb Right. These fields occupy 13 lines per frame, repeated sequentially up to 1,950 lines (13 lines x 150 frames) representing the sign video. The face data follow the same organisation: seven fields (FaceBox; FaceRotation; HeadPivot; AnimationUnit; FaceModel X, Y and Z; ColorFaceModel X and Y; and DepthFaceModel X and Y) describe the 1,347 face points, occupying 11 lines per frame and 1,650 lines (11 lines x 150 frames) in the ".txt" file. This dataset is publicly available at Zenodo (Almeida et al., 2019b; Almeida et al., 2020). Some scientific works related to this dataset are Almeida et al. (2017a), Almeida (2017b), Castro et al. (2019), Mendes (2019) and Guerra (2019).
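The frame layout described above (13 body lines and 11 face lines per frame, over 150 frames) can be sketched as a grouping routine. This is a sketch under assumptions: each frame is taken as a fixed block of consecutive lines keyed by field-row name, while the actual delimiter and per-line value layout must be checked against the ".txt" files on Zenodo:

```python
# Group the raw ".txt" lines of one recorded sign into per-frame records.
# ASSUMPTION: each frame occupies a fixed block of consecutive lines
# (13 for body data, 11 for face data); how the values for the 25 body
# points or 1,347 face points are laid out within each line must be
# verified against the files themselves.

FRAMES_PER_SIGN = 150

BODY_ROWS = [  # 13 rows per frame, matching the seven body fields
    "PositionX", "PositionY", "PositionZ",
    "OrientationX", "OrientationY", "OrientationZ",
    "TrackingState", "LeftHandState", "RightHandState",
    "ColorPositionX", "ColorPositionY",
    "DepthPositionX", "DepthPositionY",
]

FACE_ROWS = [  # 11 rows per frame, matching the seven face fields
    "FaceBox", "FaceRotation", "HeadPivot", "AnimationUnit",
    "FaceModelX", "FaceModelY", "FaceModelZ",
    "ColorFaceModelX", "ColorFaceModelY",
    "DepthFaceModelX", "DepthFaceModelY",
]

def split_frames(lines, row_names):
    """Slice a flat list of file lines into frames of len(row_names) rows."""
    per_frame = len(row_names)
    if len(lines) != per_frame * FRAMES_PER_SIGN:
        raise ValueError(f"expected {per_frame * FRAMES_PER_SIGN} lines, "
                         f"got {len(lines)}")
    return [
        dict(zip(row_names, lines[f * per_frame:(f + 1) * per_frame]))
        for f in range(FRAMES_PER_SIGN)
    ]
```

For a body file this yields 150 records of 13 rows each (13 x 150 = 1,950 lines), and for a face file 150 records of 11 rows each (11 x 150 = 1,650 lines), mirroring the line counts stated above.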

References:

Almeida, Silvia Grasiella Moreira; Guimarães, Frederico Gadelha and Ramírez, Jaime Arturo. A Methodology for Feature Extraction in Brazilian Sign Language Recognition. In: Workshop de Visão Computacional. (2013). Available at: http://iris.sel.eesc.usp.br/wvc/Anais_WVC2013/Poster/1/12.pdf.

Almeida, Silvia Grasiella Moreira. (2014). Libras-34 Dataset (Kinect v1) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4451526.

Almeida, Sílvia Grasiella Moreira; Guimarães, Frederico Gadelha; Ramírez, Jaime Arturo. Feature extraction in Brazilian Sign Language Recognition based on phonological structure and using RGB-D sensors. Expert Systems with Applications, v. 41, n. 16, p. 7259-7271, 2014a. Available at: https://www.sciencedirect.com/science/article/pii/S0957417414003042.

Almeida, Sílvia Grasiella Moreira. Extração de Características em Reconhecimento de Parâmetros Fonológicos da Língua Brasileira de Sinais utilizando Sensores RGB-D. (2014b). 145f. Doctoral thesis - Universidade Federal de Minas Gerais, Minas Gerais. Available at: https://www.ppgee.ufmg.br/defesas/303D.PDF.

Rezende, Tamires Martins; Castro, Cristiano Leite; Mota, Felipe Augusto Olivera; Nametala, Ciniro Aparecido Leite and Corrêa, Ramon Santos. Reconhecimento de expressões faciais em sinais da língua brasileira de sinais (Libras) utilizando os classificadores K-NN e SVM. In XII Simpósio de Mecânica Computacional (SIMMEC 2016). (2016a). Available at: http://acervo.ufvjm.edu.br/jspui/handle/1/1556.

Rezende, Tamires Martins; Castro, Cristiano Leite and Almeida, Silvia Grasiella Moreira (2016b). An approach for Brazilian Sign Language (BSL) recognition based on facial expression and k-NN classifier. In 29th SIBGRAPI, Workshop on Face Processing Applications, on Proceedings (pp. 1-2). Available at: http://gibis.unifesp.br/sibgrapi16/eproceedings/wfpa/6.pdf.

Rezende, Tamires Martins. Aplicação de Técnicas de Inteligência Computacional para Análise da Expressão Facial em Reconhecimento de Sinais de Libras. 2016c. 108f. Masters Dissertation - Universidade Federal de Minas Gerais, Minas Gerais, 2016. Available at: https://www.ppgee.ufmg.br/defesas/1393M.PDF.

Almeida, Silvia Grasiella Moreira; Toffolo, Andreia Chagas Rocha; Guimarães, Frederico Gadelha and Castro, Cristiano Leite. Protocolo para construção de uma base de sinais da língua brasileira de sinais utilizando sensores RGB-D em plataforma de software livre. In Anais do Encontro Virtual de Documentação em Software Livre e Congresso Internacional de Linguagem e Tecnologia Online (Vol. 6, No. 1). (2017a). Available at: http://www.periodicos.letras.ufmg.br/index.php/anais_linguagem_tecnologia/article/view/12162.

Rezende, Tamires Martins; Castro, Cristiano Leite; Almeida, Silvia Grasiella Moreira and Guimarães, Frederico Gadelha. Análise da Expressão Facial em Reconhecimento de Sinais de Libras. In: VI Simpósio Brasileiro de Automação Inteligente. 2017. p. 465-470. Available at: https://www.ufrgs.br/sbai17/papers/paper_152.pdf.

Almeida, Gabriela Tolentino Boaventura. Criação de Banco de Sinais de Libras para Implementação de Sistemas com Visão Computacional. 2017b. 77f. Trabalho de Conclusão de Curso - Universidade Federal de Minas Gerais, Minas Gerais, 2017.

Guerra, Rubia Reis; Rezende, Tamires Martins; Guimarães, Frederico Gadelha and Almeida, Silvia Grasiella Moreira. Facial Expression Analysis in Brazilian Sign Language for Sign Recognition. In: Anais do XV Encontro Nacional de Inteligência Artificial e Computacional. SBC, 2018. p. 216-227. Available at: https://sol.sbc.org.br/index.php/eniac/article/view/4418.

Almeida, Silvia Grasiella Moreira; Rezende, Tamires Martins; Toffolo, Andreia Chagas Rocha and Castro, Cristiano Leite. (2019a). Libras-10 Dataset [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3229958.

Almeida, Sílvia Grasiela Moreira; Rezende, Tamires Martins; Almeida, Gabriela Tolentino Boaventura; Toffolo, Andreia Chagas Rocha; and Guimarães, Frederico Gadelha. (2019b). MINDS-Libras Dataset [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2667329.

Castro, Giulia Z.; Guerra, Rubia R.; Assis, Moises M.; Rezende, Tamires M.; Almeida, Gabriela T.B.; Almeida, Silvia G.M.; Castro, Cristiano L. and Guimarães, Frederico G. Desenvolvimento de uma Base de Dados de Sinais de Libras para Aprendizado de Máquina: Estudo de Caso com CNN 3D. In: VI Simpósio Brasileiro de Automação Inteligente. 2019. Available at: https://tinyurl.com/y4rtoxum.

Mendes, Moises. Aplicação de Deep Learning no reconhecimento de sinais de Libras: aspectos técnicos e sociais. 2019. 51f. Trabalho de Conclusão de Curso - Universidade Federal de Minas Gerais, Minas Gerais, 2019.

Guerra, Rúbia Reis. Deep Learning for Accessibility: Detection and Segmentation of Regions of Interest for Sign Language. 2019. 58f. Trabalho de Conclusão de Curso - Universidade Federal de Minas Gerais, Minas Gerais, 2019. Available at: https://rubia-rg.github.io/projects/TCCI_RubiaGuerra.pdf.

Almeida, Silvia Grasiella Moreira; Rezende, Tamires Martins; Toffolo, Andreia Chagas Rocha and Castro, Cristiano Leite. (2020). MINDS-Libras Dataset (RGB-D sensor data) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4322984.