• Topic model based on semantic distribution similarity

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-10-11; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: Latent Dirichlet allocation (LDA) is a popular three-layer Bayesian probabilistic model that clusters words and documents at the topic level. LDA relies on the bag-of-words assumption, which simplifies modeling but yields topics with poor semantic coherence and weak text representation ability. To address this problem, this paper proposes a topic model based on semantic distribution similarity. Under the framework of the EM (expectation-maximization) algorithm, the model uses the GPU (generalized Pólya urn) model to incorporate word-word and document-topic semantic distribution similarity into topic modeling, which weakens the effect of the bag-of-words assumption on topics at the level of semantic association. Experiments on four public datasets show that the proposed model outperforms currently popular topic modeling algorithms in topic semantic coherence and text classification accuracy, and that it improves convergence speed and topic accuracy.
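
    The generalized Pólya urn idea mentioned in the abstract can be illustrated with a minimal sketch: when a word is assigned to a topic, related words also receive a fractional count in that topic. This is only the generic urn-style count update, not the paper's EM formulation; the function name, the `similar_words` table, and the `promotion` weight are hypothetical.

```python
from collections import defaultdict


def gpu_increment(topic_word_counts, topic, word, similar_words, promotion=0.3):
    """Record one observation of `word` in `topic`, urn-style.

    Besides the usual +1 for the observed word, every word listed as
    similar to it gets a fractional count (`promotion`, an assumed value).
    """
    topic_word_counts[topic][word] += 1.0
    for other in similar_words.get(word, ()):
        topic_word_counts[topic][other] += promotion


# Hypothetical similarity table; a real one would come from a similarity measure.
similar = {"network": ["router"]}
counts = defaultdict(lambda: defaultdict(float))
gpu_increment(counts, topic=0, word="network", similar_words=similar)
```

    Under a plain urn scheme only the observed word's count would grow; the extra fractional counts are what let semantically related words reinforce each other within a topic.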

  • Protein melting temperature prediction based on a multilayer perceptron

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-05-24; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: Accurately predicting protein melting temperature is important in protein engineering and drug design. This paper proposes a novel weight-based dimensionality reduction algorithm and applies it to a combination of global and sequential features, used as preliminary features, to obtain the input features of a multilayer perceptron (MLP) model. On blind test sets, the Pearson correlation coefficient (PCC) between predicted and experimental melting temperatures increased from 0.77 to 0.8, and the root-mean-square error (RMSE) decreased from 0.17 to 0.16. The classification accuracy of the melting temperatures predicted by the algorithm was significantly higher than that of the latest available prediction service.
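
    The two evaluation metrics reported above, PCC and RMSE, can be computed as in the following minimal sketch. The function names and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math


def pearson_cc(pred, true):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    dev_t = math.sqrt(sum((t - mean_t) ** 2 for t in true))
    return cov / (dev_p * dev_t)


def rmse(pred, true):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))


# Made-up values, for illustration only.
predicted = [1.0, 2.0, 3.0]
experimental = [2.0, 4.0, 6.0]
pcc_value = pearson_cc(predicted, experimental)
rmse_value = rmse(predicted, experimental)
```

    A higher PCC means the predictions track the experimental values more closely, while a lower RMSE means smaller absolute deviations, so the two are usually read together.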

  • Application of multi-granularity temporal features in churn prediction

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-05-02; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: Telecom operators have developed multiple churn prediction models to identify potential churners in different scenarios. Existing churn prediction models first select a single time granularity for feature extraction and then model the extracted data with a machine learning algorithm. Such approaches consider only the influence of the model on classification performance and do not fully consider the role of the data. To address this problem, this paper proposes a method that extracts multi-granularity temporal features and integrates features of different granularities at different training phases. Experimental results show that a model trained with multi-granularity features clearly outperforms one trained with single-granularity features.

  • User telecom trajectory recovery algorithm based on the LDA topic model

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-04-24; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: With the development of mobile communication technology and the popularization of mobile devices, daily trajectory records have become abundant. Massive trajectory data hides valuable knowledge about individuals and human society. For knowledge models built on trajectory data to serve users accurately and effectively, it is particularly important to recover missing telco trajectories accurately and reliably. Most current methods focus on modeling continuous trajectories such as GPS trajectories, and there is little research on restoring telco trajectories generated in mobile communication scenarios. This paper therefore transforms the problem of telecommunication trajectory recovery into a matrix completion problem and proposes a recovery algorithm based on the LDA topic model. The experiments provide a comprehensive comparison with the traditional matrix completion algorithm and examine the effect of different parameters on trajectory recovery. The results show that, compared with the traditional matrix completion algorithm, the LDA topic model can significantly improve the recovery accuracy of missing telco trajectories.

  • Task distribution mechanism based on social attributes and effective user calculation in crowd sensing

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-04-19; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: With the rapid development of wireless sensor networks and the popularity of smart mobile devices, mobile crowd sensing (MCS) has become a core part of mobile computing. Crowd sensing can accomplish large-scale sensing tasks in complex environments and social settings, and task distribution is an important part of such applications. To address the problems that the sensing environment is complex, the number of users cannot meet the requirements, and the quality of the collected data is low, this paper proposes a task distribution mechanism (EUC) based on social attributes and effective user calculation. EUC first screens users by their characteristics according to the task. From the user's perspective, EUC considers users' social attributes and uses information from their social networks to increase the number of effective users. From the platform's perspective, EUC dynamically adjusts the points of effective users according to how tasks are received and submitted, thereby maintaining the number of effective users across the whole system. Theoretical analysis and experimental results show that the proposed mechanism improves the task distribution efficiency of the system and the quality of the collected data.

  • Research on a wireless network coding transmission strategy based on node sociality

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-04-17; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: A delay-tolerant network (DTN) is a network architecture that lacks continuous connectivity, so choosing an appropriate forwarding node is key to efficient forwarding and delivery. Because node mobility and dynamic changes in network topology affect the transmission efficiency of a DTN, this paper proposes a new DTN network model, NSNC-DTN, based on node sociality and random linear network coding. The NSNC-DTN model chooses the most appropriate forwarding node according to the community structure, the similarity of communities, and the activity of nodes in the network. It calculates node sociality offline, applies random linear network coding at source nodes and center nodes, and forwards messages online to achieve efficient forwarding and delivery. Simulation results show that the NSNC-DTN model effectively improves the success rate of information delivery and reduces network delay and network overhead.
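
    Random linear network coding, which the abstract names as one building block, can be sketched as follows. This is a generic illustration over GF(2), chosen for simplicity (practical systems often use larger fields), and it says nothing about how NSNC-DTN selects forwarding nodes. The function names and sample packets are hypothetical.

```python
import random


def rlnc_encode(packets, rng):
    """Return (coeffs, payload): a random GF(2) combination of equal-length packets."""
    coeffs = [rng.randint(0, 1) for _ in packets]
    if not any(coeffs):  # avoid the useless all-zero combination
        coeffs[rng.randrange(len(packets))] = 1
    payload = bytes(len(packets[0]))
    for c, p in zip(coeffs, packets):
        if c:
            payload = bytes(a ^ b for a, b in zip(payload, p))
    return coeffs, payload


def rlnc_decode(coded, k):
    """Gauss-Jordan elimination over GF(2); returns the k originals, or None if rank < k."""
    rows = [(list(c), bytes(p)) for c, p in coded]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][0][col]), None)
        if pivot is None:
            return None  # not enough independent combinations yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][0][col]:
                rows[r] = (
                    [a ^ b for a, b in zip(rows[r][0], rows[col][0])],
                    bytes(a ^ b for a, b in zip(rows[r][1], rows[col][1])),
                )
    return [rows[i][1] for i in range(k)]


# A receiver keeps collecting coded packets until it can decode.
packets = [b"ab", b"cd", b"ef"]
rng = random.Random(0)
coded, decoded = [], None
for _ in range(200):
    coded.append(rlnc_encode(packets, rng))
    decoded = rlnc_decode(coded, len(packets))
    if decoded is not None:
        break
```

    The receiver does not need any particular coded packet, only enough linearly independent ones, which is why this kind of coding suits networks with intermittent connectivity.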