[Python Natural Language Processing + tkinter GUI] Building an Intelligent Medical Customer Service Q&A Robot in Practice (with source code, dataset, and ultra-detailed demo)

Date: 2024-04-30

If you need the source code and dataset, please like, follow, and favorite, then leave a private message in the comments~~~

I. Introduction to Intelligent Q&A Customer Service

QA is short for Question-and-Answer: a system retrieves answers to the questions posed by the user and replies in natural language the user can understand. QA customer service focuses on one-question-one-answer interaction, with an emphasis on reasoning over knowledge.

From the application-domain perspective, Q&A systems can be categorized into limited-domain Q&A systems and open-domain Q&A systems.

Depending on the document repository or knowledge base that supplies the answers, and on the implementation technique, Q&A systems can be classified as natural-language database Q&A systems, conversational Q&A systems, reading-comprehension systems, FAQ-based Q&A systems, knowledge-base Q&A systems, and so on.

Intelligent Q&A Customer Service Functional Architecture

A typical Q&A system includes question input, question comprehension, information retrieval, information extraction, answer ranking, answer generation, and result output. First, the user asks a question; the retrieval step queries the knowledge base for relevant information; candidate-answer feature vectors are then extracted from that information according to specific rules; finally, the candidate answers are filtered and the best result is output to the user. The toy sketch below illustrates this flow end to end.
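A minimal, runnable sketch of the flow above. The knowledge base and the keyword-overlap "retrieval" are toy stand-ins for the real components, and all names and data here are illustrative.

knowledge_base = {
    "What causes a headache?": "Common causes include stress, fatigue, and dehydration.",
    "How do I treat a cold?": "Rest, drink fluids, and consider over-the-counter medicine.",
}

def answer(question):
    words = set(question.lower().split())                  # question comprehension
    candidates = [(len(words & set(q.lower().split())), a)
                  for q, a in knowledge_base.items()]      # retrieval + extraction
    score, best = max(candidates)                          # answer ranking
    return best if score > 0 else "Sorry, I don't know."   # answer generation / output

print(answer("what causes my headache"))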

[Figure: functional architecture of an intelligent Q&A system]

Intelligent Q&A Customer Service Framework

1: Question Processing. This step identifies the information contained in the question, determines the question's topic and its category (for example, whether it is a general question or a question on a specific topic), and then extracts the key information related to the topic, such as person, location, and time.

2: Question Mapping. Question mapping removes ambiguity from the user's query. The mapping is resolved with string-similarity matching, a synonym table, and similar techniques, with split and merge operations performed as needed.

3: Query Construction. The input question is transformed into a query language the computer can understand, and the knowledge graph or database is then queried to retrieve candidate answers.

4: Knowledge Reasoning. Reasoning proceeds from the question's attributes: if the question's basic attributes are already defined in the knowledge graph or database, the answer can be looked up and returned directly; if they are undefined, the answer must be generated by machine reasoning.

5: Disambiguation and Ranking. Given the one or more candidate answers returned from the knowledge graph, the system disambiguates and ranks them against the question's attributes and outputs the best answer. A toy sketch of steps 2 and 5 follows this list.
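Here is a minimal sketch of mapping a user question through a synonym table and ranking candidate questions by string similarity with fuzzywuzzy (the same library the project code imports). The synonym table and candidate questions are hypothetical examples.

from fuzzywuzzy import fuzz, process

synonyms = {"tummy ache": "stomach ache"}  # hypothetical synonym table
known_questions = [
    "What should I do about a stomach ache?",
    "How do I treat a headache?",
]

def map_question(user_input):
    # Step 2: normalize the query with the synonym table before matching.
    for alias, canonical in synonyms.items():
        user_input = user_input.replace(alias, canonical)
    # Step 5: rank the candidate questions by similarity and keep the best.
    best, score = process.extractOne(user_input, known_questions, scorer=fuzz.ratio)
    return best, score

print(map_question("what should I do about a tummy ache"))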

II. Intelligent Medical Customer Service Q&A in Practice

A customized intelligent customer service program generally needs to select a corpus, remove noise, train the algorithm on the preprocessed data, and finally provide a human-computer Q&A dialog interface. Based on a medical corpus obtained from the Internet and the basic principle of cosine similarity, the following intelligent medical customer service Q&A application was designed and developed. A small sketch of the cosine-similarity principle follows.
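A minimal sketch of the cosine-similarity principle the matching relies on: two TF-IDF vectors are compared by the cosine of the angle between them. The two vectors below are hypothetical TF-IDF values, for illustration only.

import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); closer to 1 means more similar.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

q1 = np.array([1.0, 0.5, 0.0])    # hypothetical TF-IDF vector of the user question
q2 = np.array([0.9, 0.4, 0.1])    # hypothetical TF-IDF vector of a corpus question
print(cosine_similarity(q1, q2))  # ~0.99, a strong match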

The project is structured as follows:

[Screenshot: project directory structure]

Demo

Here are some of the Q&A cases defined in the CSV file:

[Screenshot: sample Q&A cases from the CSV file]

Predefined welcome statements:

[Screenshot: predefined welcome statements]

Run the chatrobot file and the window below pops up; type a question and click Submit Inquiry. A minimal sketch of such a window follows the screenshot.

[Screenshot: chat window after submitting an inquiry]
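For reference, a minimal sketch of what such a tkinter window could look like. The widget layout and the answer_question placeholder are illustrative, not the project's actual GUI code.

from tkinter import Tk, Text, Entry, Button, END

def answer_question(question):
    return "..."  # placeholder: look up / infer the answer here

root = Tk()
root.title("Intelligent Medical Customer Service")

history = Text(root, height=20, width=60)  # conversation log
history.pack()
entry = Entry(root, width=50)              # question input box
entry.pack()

def on_submit():
    question = entry.get()
    history.insert(END, "You: " + question + "\n")
    history.insert(END, "Bot: " + answer_question(question) + "\n")
    entry.delete(0, END)

Button(root, text="Submit Inquiry", command=on_submit).pack()
root.mainloop()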

Answers to questions that are not in the corpus are inferred automatically (usually with lower accuracy):

[Screenshot: automatically inferred answers for out-of-corpus questions]

III. Code

Part of the code is shown below. For the full code and dataset, please like, follow, and favorite, then leave a private message in the comments.

# -*- coding:utf-8 -*-
# Imports used by this training excerpt. tkinter and fuzzywuzzy are used
# by the GUI / answer-matching code that is not shown here.
import csv
import math
import pickle
from collections import Counter

import jieba
from scipy import sparse
from scipy.sparse import lil_matrix
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from fuzzywuzzy import fuzz
from tkinter import *


filename = 'label.csv'  # corpus CSV with columns: label, question, answer

def tokenization(filename):
    """Read the CSV corpus, segment each question with jieba, and pickle
    the intermediate artifacts."""
    corpus = []
    label = []
    question = []
    answer = []
    with open(filename, 'r', encoding="utf-8") as f:
        data_corpus = csv.reader(f)
        next(data_corpus)  # skip the header row
        for words in data_corpus:
            # Join the jieba tokens with spaces so that CountVectorizer
            # can split them back into terms; concatenating them without
            # spaces would simply rebuild the unsegmented sentence.
            corpus.append(' '.join(jieba.cut(words[1])))
            question.append(words[1])
            label.append(words[0])
            answer.append(words[2])

    # Despite the .h5 extension, these are plain pickle files.
    with open('corpus.h5', 'wb') as f:
        pickle.dump(corpus, f)
    with open('label.h5', 'wb') as f:
        pickle.dump(label, f)
    with open('question.h5', 'wb') as f:
        pickle.dump(question, f)
    with open('answer.h5', 'wb') as f:
        pickle.dump(answer, f)

    return corpus, label, question, answer



def train_model():
    """Vectorize the segmented corpus with TF-IDF and train a multinomial
    Naive Bayes classifier on the question labels."""
    with open('corpus.h5', 'rb') as f_corpus:
        corpus = pickle.load(f_corpus)
    with open('label.h5', 'rb') as f_label:
        label = pickle.load(f_label)

    vectorizer = CountVectorizer(min_df=1)
    transformer = TfidfTransformer()
    words_frequency = vectorizer.fit_transform(corpus)  # raw term counts
    tfidf = transformer.fit_transform(words_frequency)  # TF-IDF weights

    # Keep the vocabulary and term-frequency matrix so that new queries
    # can be vectorized the same way at answer time.
    saved = tfidf_calculate(vectorizer.vocabulary_,
                            sparse.csc_matrix(words_frequency),
                            len(corpus))
    model = MultinomialNB()
    model.fit(tfidf, label)

    with open('model.h5', 'wb') as f_model:
        pickle.dump(model, f_model)
    with open('idf.h5', 'wb') as f_idf:
        pickle.dump(saved, f_idf)

    return model, tfidf, label
    
    
    
    
class tfidf_calculate(object):
    """Vectorizes a new query using the vocabulary and document
    frequencies collected at training time."""

    def __init__(self, feature_index, frequency, docs):
        self.feature_index = feature_index  # term -> column index
        self.frequency = frequency          # term-frequency matrix (csc)
        self.docs = docs                    # number of training documents
        self.len = len(feature_index)

    def key_count(self, input_words):
        # Segment the query with jieba and count each term.
        return Counter(jieba.cut(input_words))

    def getTfidf(self, input_words):
        count = self.key_count(input_words)
        result = lil_matrix((1, self.len))
        frequency = sparse.csc_matrix(self.frequency)
        for x in count:
            word = self.feature_index.get(x)
            if word is not None:
                # Document frequency: the number of documents containing
                # the term (not its total count across the corpus).
                feature_docs = frequency.getcol(word).nnz
                # Smoothed IDF, matching sklearn's formula:
                # idf = log((n_docs + 1) / (df + 1)) + 1
                result[0, word] = count[x] * (math.log((self.docs + 1) / (feature_docs + 1)) + 1)
        return result

if __name__ == "__main__":
    tokenization(filename)  # build and pickle the corpus artifacts
    train_model()           # train and save the Naive Bayes classifier
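For completeness, here is one way the saved artifacts might be wired into the answering step: predict a label with the Naive Bayes model, then return the answer of the most similar stored question. This flow is an assumption based on the artifacts saved above, not the project's exact answering code (note that unpickling idf.h5 requires the tfidf_calculate class to be importable).

import pickle
from fuzzywuzzy import fuzz

def reply(user_input):
    # Load the pickled artifacts produced by tokenization() / train_model().
    with open('model.h5', 'rb') as f:
        model = pickle.load(f)
    with open('idf.h5', 'rb') as f:
        saved = pickle.load(f)  # tfidf_calculate instance
    with open('question.h5', 'rb') as f:
        questions = pickle.load(f)
    with open('answer.h5', 'rb') as f:
        answers = pickle.load(f)
    with open('label.h5', 'rb') as f:
        labels = pickle.load(f)

    tfidf = saved.getTfidf(user_input)   # vectorize the query
    predicted = model.predict(tfidf)[0]  # predict the question category
    # Among questions in the predicted category, return the answer of the
    # one most similar to the user input.
    best = max((i for i, l in enumerate(labels) if l == predicted),
               key=lambda i: fuzz.ratio(user_input, questions[i]))
    return answers[best]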

This wasn't easy to create. If you find it helpful, please like and favorite it~~~
