Information classification program, information classification method, and information processing apparatus
Patent abstract:
An information classification program includes the following: accepting a search keyword; retrieving, from multiple information items (111) posted by multiple users, a posted information item (111) including the accepted search keyword, each of the multiple posted information items (111) including at least either of a text information item (113) and an image information item (112), and acquiring posted information items (111) which are within a predetermined chronological range with respect to the posted information item (111) including the search keyword; and classifying, as image information items (112) related to the search keyword, some of image information items (112) included in the posted information items (111) that have been acquired, and performing first determination of, for each of the classified image information items (112), whether or not a user who posted an information item (111) including the classified image information item (112) took an action related to the search keyword.

Publication number: AU2013201006A1
Application number: U2013201006
Filing date: 2013-02-22
Publication date: 2014-03-20
Inventors: Motofumi Fukui; Noriji Kato; Tomoko Okuma
Applicant: Fuji Xerox Co Ltd
Main IPC class: G06F17-27
Patent description:
INFORMATION CLASSIFICATION PROGRAM, INFORMATION CLASSIFICATION METHOD, AND INFORMATION PROCESSING APPARATUS

DESCRIPTION

Background

(i) Technical Field

[0001] The present invention relates to an information classification program, an information classification method, and an information processing apparatus.

(ii) Related Art

[0002] Any discussion of the prior art throughout the specification should in no way be considered as an admission that such prior art is widely known or forms part of common general knowledge in the field.

[0003] In the related art, an information processing apparatus has been proposed, which classifies a text information item transmitted and posted on a website, by determining whether or not a person who transmitted the text information item took an action related to the text information item (for example, Kentaro Inui, Kazuo Hara, "Experience Mining: Extraction and Classification of Experiences of Individual Persons from Web Documents", proceedings of the fourteenth annual meeting of the Association for Natural Language Processing, issued 2008, pp. 1077-1080, hereinafter, referred to as "NPL").

[0004] The information processing apparatus disclosed in NPL acquires a text information item transmitted and posted on a website. The information processing apparatus analyzes text items included in the acquired text information item, and extracts, from the text items, items such as a topic, a subject with an experience, a case type, facticity, and a case expression. Moreover, the information processing apparatus further extracts, using specific words or functional words, from these extracted items, a time information item (a tense/phase), and information items concerning the polarity of the content of the text information item (positive/negative) and modality (the attitude of a person who transmitted the text information item).
Accordingly, the information processing apparatus classifies the text information item by determining whether or not the person who transmitted the text information took an action related to the text information item.

Summary

[0005] It is an object of the present invention to provide an information classification program, an information classification method, and an information processing apparatus that classify posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, by determining whether or not an action related to a provided keyword was taken.

[0006] In order to achieve the above-mentioned object, according to aspects of the present invention, there are provided an information classification program, an information classification method, and an information processing apparatus.

[0007] According to a first aspect, there is provided an information classification program. The information classification program includes the following: accepting a search keyword; retrieving, from multiple information items posted by multiple users, a posted information item including the accepted search keyword, each of the multiple posted information items including at least either of a text information item and an image information item, and acquiring posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword; and classifying, as image information items related to the search keyword, some of image information items included in the posted information items that have been acquired, and performing first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.
[0008] According to a second aspect, in the information classification program according to the first aspect, in the acquiring, a posted information item including the search keyword is retrieved from the multiple posted information items, and, among other posted information items of a user who posted the information item including the search keyword, posted information items that are within a predetermined chronological range with respect to the posted information item including the search keyword are acquired.

[0009] According to a third aspect, in the information classification program according to the first or second aspect, in the classifying and performing first determination, text information items included in the posted information items that have been acquired are classified, and, for each of the classified text information items, whether or not a user who posted an information item including the classified text information item took the action related to the search keyword is determined.

[0010] According to a fourth aspect, in the information classification program according to any one of the first to third aspects, the process further includes classifying text information items included in the posted information items that have been acquired, and, performing second determination of, for each of the classified text information items, whether the content of a posted information item including the classified text information item is favorable or unfavorable.

[0011] According to a fifth aspect, in the information classification program according to the fourth aspect, in the classifying and performing second determination, image information items included in the posted information items that have been acquired are classified, and, for each of the classified image information items, whether the content of a posted information item including the classified image information item is favorable or unfavorable is determined.
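The acquiring step of the first and second aspects can be illustrated with a minimal sketch. The `Post` record, the `acquire_posts` function, the substring match on the keyword, and the symmetric time window are all assumptions made for illustration; the specification does not prescribe any of them here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Post:
    """Hypothetical stand-in for one posted information item."""
    user_id: str
    posted_at: datetime
    text: str = ""
    image_url: str = ""


def acquire_posts(posts, keyword, window, same_user_only=True):
    """Return posts lying within `window` of any post whose text contains `keyword`.

    With same_user_only=True, only posts by the user who posted the
    keyword-containing item are kept (the restriction of the second aspect).
    """
    hits = [p for p in posts if keyword in p.text]
    acquired = []
    for p in posts:
        for h in hits:
            if same_user_only and p.user_id != h.user_id:
                continue
            if abs(p.posted_at - h.posted_at) <= window:
                acquired.append(p)
                break  # one nearby hit is enough to acquire this post
    return acquired


# Illustrative data only; the timestamps and URL are made up.
t0 = datetime(2013, 2, 22, 19, 0)
posts = [
    Post("u1", t0, text="At the ABC fireworks display"),
    Post("u1", t0 + timedelta(minutes=10), image_url="http://example.com/a.jpg"),
    Post("u1", t0 + timedelta(days=2), text="Back at work"),
    Post("u2", t0 + timedelta(minutes=5), text="Lunch"),
]
acquired = acquire_posts(posts, "ABC fireworks", timedelta(hours=1))
acquired_any_user = acquire_posts(
    posts, "ABC fireworks", timedelta(hours=1), same_user_only=False
)
```

In this sketch the image-only post made ten minutes after the keyword post is acquired even though it contains no text, which is the point of using a chronological range rather than keyword matching alone.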
[0012] According to a sixth aspect, there is provided an information classification method. The information classification method includes the following: accepting a search keyword; retrieving, from multiple information items posted by multiple users, a posted information item including the accepted search keyword, each of the multiple posted information items including at least either of a text information item and an image information item, and acquiring posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword; and classifying, as image information items related to the search keyword, some of image information items included in the posted information items that have been acquired, and performing first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.

[0013] According to a seventh aspect, there is provided an information processing apparatus. The information processing apparatus includes an accepting unit, an acquisition unit, and a first determination unit. The accepting unit accepts a search keyword. The acquisition unit retrieves, from multiple information items posted by multiple users, a posted information item including the search keyword accepted by the accepting unit, each of the multiple posted information items including at least either of a text information item and an image information item, and acquires posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword.
The first determination unit classifies, as image information items related to the search keyword, some of image information items included in the posted information items acquired by the acquisition unit, and performs first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.

[0014] According to each of the first, sixth, and seventh aspects of the present invention, the posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, can be classified by determining whether or not the action related to the provided keyword was taken.

[0015] According to the second aspect of the present invention, among the posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, for a posted information item including the search keyword, other posted information items of a user who posted the information item including the search keyword can be classified by determining whether or not the action related to the provided keyword was taken.

[0016] According to the third aspect of the present invention, the posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, can be classified, on the basis of the text information items, by determining whether or not the action related to the provided keyword was taken.
[0017] According to the fourth aspect of the present invention, the posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, can be classified, on the basis of the text information items, by determining whether the content of each of the posted information items is favorable or unfavorable.

[0018] According to the fifth aspect of the present invention, the posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, can be classified, on the basis of the image information items, by determining whether the content of each of the posted information items is favorable or unfavorable.

Brief Description of the Drawings

[0019] Exemplary embodiments of the present invention will be described in detail based on the following figures, wherein:

[0020] Fig. 1 is a block diagram illustrating an example of a configuration of a microblog classification system according to an exemplary embodiment of the present invention;

[0021] Fig. 2 is a block diagram illustrating an example of a configuration of a microblog classification server;

[0022] Fig. 3 is a schematic diagram illustrating an example of a configuration of microblog information items;

[0023] Fig. 4 is a schematic diagram illustrating an example of a configuration of microblog image information items;

[0024] Fig. 5 is a schematic diagram illustrating an example of a configuration of microblog text information items;

[0025] Fig. 6 is a schematic diagram illustrating an example of a configuration of an image list display screen that is displayed on a display of a terminal by an image list display unit;

[0026] Fig.
7 is a schematic diagram illustrating an example of a configuration of classification-result information items;

[0027] Fig. 8 is a schematic diagram illustrating an example of a configuration of a display screen that is output by a classification-result output unit on the basis of the classification result information items;

[0028] Fig. 9 is a flowchart illustrating an example of an operation of the microblog classification server in the case of generating an image classification model;

[0029] Fig. 10 is a flowchart illustrating an example of an operation of the microblog classification server in the case of acquiring the microblog information items;

[0030] Fig. 11 is a flowchart illustrating an example of an operation of the microblog classification server in the case of determining whether or not a user who posted an information item, which is referred to as a posted information item, took an action; and

[0031] Fig. 12 is a flowchart illustrating an example of an operation of the microblog classification server in the case of determining whether the content of a posted information item is positive or negative.

Detailed Description

Exemplary Embodiment

Configuration of microblog classification system

[0032] Fig. 1 is a schematic diagram illustrating an example of a configuration of a microblog classification system according to an exemplary embodiment of the present invention.

[0033] A microblog classification system 6 includes a microblog classification server 1, a microblog server 2, a terminal 3, and a terminal 4, and connects, using a network 5, the individual apparatuses so that the apparatuses are able to communicate with each other. Here, a microblog is a medium in which multiple text information items and image information items that were posted (transmitted) are mixed and displayed in chronological order.
More specifically, microblog information items stored in the microblog server 2 are subjected to a display process by an information processing apparatus such as the terminal 3 or 4, whereby the microblog is displayed. Hereinafter, the unit of an information item posted on the microblog is referred to, for simplicity, as a "posted information item". It is supposed that a posted information item includes a text information item and an image information item, includes only a text information item, or includes only an image information item. In other words, each of the microblog information items includes multiple posted information items. Furthermore, posted information items are information items posted by multiple users.

[0034] The microblog classification server 1 is an information processing apparatus that includes electronic components such as a central processing unit (CPU) having functions for processing information items and a memory. The microblog classification server 1 accepts a search keyword, acquires posted information items related to the search keyword from the microblog server 2, and classifies the acquired posted information items. Each of the acquired posted information items is classified by determining whether or not a user who posted the information item that has been acquired as a posted information item took an action related to the search keyword, and by determining whether the content of the posted information item that has been acquired is positive (favorable) or negative (unfavorable).

[0035] The microblog server 2 is an information processing apparatus that includes electronic components such as a CPU having functions for processing information items and a memory.
The microblog server 2 accepts text information items such as text items and/or image information items concerning still images such as photographs or moving images, which have been transmitted from the terminal 4 or the like and which are to be referred to as posted information items, and generates microblog information items for displaying the posted information items in chronological order. Moreover, when the microblog server 2 accepts, from the terminal 4, a request to view the microblog information items, the microblog server 2 transmits the microblog information items to the terminal 4. Note that it is supposed that an image information item included in a posted information item directly includes an information item concerning a still image or a moving image or includes a link destination information item concerning a link destination in which an information item concerning a still image or a moving image is stored. Furthermore, a text information item included in a posted information item may directly include an information item concerning a text item or may include a link destination information item concerning a link destination in which an information item concerning a text file, a hypertext markup language (HTML) file, or the like is stored.

[0036] The terminal 3 includes an operation unit such as a keyboard or mouse used to input an instruction for an operation, a display such as a liquid crystal display, a controller such as a CPU having functions for processing information items, and a memory such as a hard disk drive (HDD). The terminal 3 accepts a search keyword input from a user, transmits the search keyword to the microblog classification server 1, and requests the microblog classification server 1 to perform classification of the microblog information items. When classification results are output from the microblog classification server 1, the terminal 3 receives the classification results, and displays the classification results on the display.
[0037] Note that the terminal 3 is, for example, a personal computer. Alternatively, a mobile phone, a personal digital assistant (PDA), or the like may be used as the terminal 3. Furthermore, although one terminal 3 is illustrated in Fig. 1, the number of terminals 3 may be two or more.

[0038] The terminal 4 includes an operation unit such as a touch panel used to input an instruction for an operation, a display such as a liquid crystal display provided under the touch panel, and a controller having electronic components such as a CPU and a memory. The terminal 4 transmits an information item, which is to be referred to as a posted information item, such as a text item or an image, to the microblog server 2 in accordance with an operation performed by a user, thereby posting the information item on the microblog. Moreover, the terminal 4 transmits, to the microblog server 2, in accordance with an operation performed by the user, a request to view the microblog. When the terminal 4 receives the microblog information items from the microblog server 2 as a result of the request to view the microblog, the terminal 4 displays, on the display, text items or images (still images or moving images) included in posted information items of the microblog.

[0039] Note that the terminal 4 is, for example, a mobile phone. Alternatively, a PDA, a personal computer, or the like may be used as the terminal 4. Furthermore, although one terminal 4 is illustrated in Fig. 1, the number of terminals 4 may be two or more.

[0040] The network 5 is a communication network such as the Internet or a local area network (LAN), regardless of whether the network 5 is a wired network or wireless network.

Configuration of microblog classification server

[0041] Fig. 2 is a block diagram illustrating an example of a configuration of the microblog classification server 1.

[0042] The microblog classification server 1 includes a controller 10, a memory 11, and a communication section 12.
The controller 10 is constituted by a CPU or the like, and controls individual units and executes various types of programs. The memory 11 is provided as an example of a storage device that is constituted by a recording medium such as an HDD or a flash memory and that stores information items. The communication section 12 communicates with an external apparatus via the network 5.

[0043] The controller 10 executes a microblog classification program 110, which is described below, thereby functioning as an image list display unit 100, a related-image selection unit 101, an image-classification-model generating unit 102, a search-keyword accepting unit 103, a microblog-information acquisition unit 104, an image-information classification unit 105, a text-information classification unit 106, an action taken/not-taken determination unit 107, a positive/negative determination unit 108, a classification-result output unit 109, and so forth.

[0044] The image list display unit 100 generates, for all or a predetermined number of microblog information items that have been acquired by the microblog-information acquisition unit 104 described below, an information item for performing list display of image information items included in the microblog information items on the display of the terminal 3, and transmits the generated information item to the terminal 3.

[0045] In a state in which the information item generated by the image list display unit 100 is displayed on the display of the terminal 3, the related-image selection unit 101 selects, in accordance with an operation performed by the user of the terminal 3, some images from the images that are being displayed.

[0046] The image-classification-model generating unit 102 extracts, as learning data items of positive examples, feature values from the images selected by the related-image selection unit 101.
The image-classification-model generating unit 102 extracts, as learning data items of negative examples, feature values from images that are not selected by the related-image selection unit 101. The image-classification-model generating unit 102 generates, using the learning data items, an image classification model 114 in relation to a search keyword accepted by the search-keyword accepting unit 103. Note that a generating method will be described below.

[0047] The search-keyword accepting unit 103 accepts a search keyword from the terminal 3.

[0048] The microblog-information acquisition unit 104 acquires, from the microblog server 2, microblog information items 111 related to the search keyword accepted by the search-keyword accepting unit 103, and stores the microblog information items 111 in the memory 11. Note that the microblog information items 111 may be acquired from all of the microblog information items accumulated in the microblog server 2, or may be acquired from microblog information items that have been obtained by filtering using a predetermined time period, a predetermined keyword, or the like. A method for acquiring the microblog information items 111 will be described below.

[0049] The image-information classification unit 105 stores information items that have been obtained by removing text information items from individual posted information items included in the microblog information items 111 acquired by the microblog-information acquisition unit 104, i.e., only image information items, as microblog image information items 112 in the memory 11.

[0050] The text-information classification unit 106 stores information items that have been obtained by removing image information items from the individual posted information items included in the microblog information items 111 acquired by the microblog-information acquisition unit 104, i.e., text information items, as microblog text information items 113 in the memory 11.
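The separation performed by the image-information classification unit 105 and the text-information classification unit 106 can be sketched roughly as follows. This is only an illustration under several assumptions: the `PostedItem` record and `split_items` function are hypothetical names, every URL in a content item is assumed to point to an image, and the URLs are kept as-is rather than being resolved to actual image data.

```python
import re
from dataclasses import dataclass

# Assumption: image information items appear in the content as http(s) URLs.
URL_PATTERN = re.compile(r"https?://\S+")


@dataclass
class PostedItem:
    """Hypothetical record for one posted information item."""
    user_id: str
    microblog_id: int
    content: str  # text, image URLs, or both


def split_items(items):
    """Split posted items into image rows and text rows.

    Each row is a (user_id, microblog_id, value) tuple. Posts without a URL
    contribute no image row; posts consisting only of URLs contribute no
    text row.
    """
    image_items, text_items = [], []
    for item in items:
        urls = URL_PATTERN.findall(item.content)
        text = URL_PATTERN.sub("", item.content).strip()
        for url in urls:
            image_items.append((item.user_id, item.microblog_id, url))
        if text:
            text_items.append((item.user_id, item.microblog_id, text))
    return image_items, text_items


# Illustrative data only.
items = [
    PostedItem("u1", 1, "Fireworks! http://example.com/f.jpg"),
    PostedItem("u2", 2, "Just text"),
    PostedItem("u3", 3, "http://example.com/g.jpg"),
]
images, texts = split_items(items)
```

Keeping the user ID and microblog ID on both outputs lets a later determination made on an image or a text be traced back to the posted information item and the user it came from.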
[0051] The action taken/not-taken determination unit 107 determines, on the basis of the image classification model 114, for each of the microblog image information items 112, whether or not a user who posted an information item including the microblog image information item 112 took an action related to the search keyword accepted by the search-keyword accepting unit 103, and stores an action taken/not-taken determination result 116 in the memory 11. Note that, when the microblog image information item 112 matches the image classification model 114, the action taken/not-taken determination unit 107 determines that the user took the action. Furthermore, for each of the microblog text information items 113, when the microblog text information item 113 does not match a text classification model 115 that is prepared, the action taken/not-taken determination unit 107 determines that the user did not take the action. Specific determination methods will be described below. Additionally, a second text classification model may be provided, and, for each of the microblog text information items 113, when the microblog text information item 113 matches the second text classification model, the action taken/not-taken determination unit 107 may determine that the user took the action.

[0052] The positive/negative determination unit 108 determines, on the basis of the text classification model 115, for each of the microblog text information items 113, whether the content of a posted information item including the microblog text information item 113 is positive (favorable) or negative (unfavorable), and stores a positive/negative determination result 117 in the memory 11. Note that the text classification model 115 used by the positive/negative determination unit 108 is learned from feature vectors, for example, each of which represents the presence/absence of individual words as an element in text items that each belong to a positive group or a negative group.
The positive/negative determination unit 108 generates a feature vector similarly for a text information item that is a classification target, and compares the generated feature vector with the feature vectors obtained as a result of learning. Accordingly, the text information item that is a classification target is classified by determining whether the text information item belongs to the positive group or the negative group. Furthermore, the positive/negative determination unit 108 may determine, on the basis of each of the microblog image information items 112, whether the content of a posted information item including the microblog image information item 112 is positive or negative.

[0053] The classification-result output unit 109 generates classification-result information items 118 from the action taken/not-taken determination results 116 and the positive/negative determination results 117, and outputs the classification-result information items 118 to an external apparatus, e.g., the terminal 3.

[0054] The memory 11 stores the microblog classification program 110, the microblog information items 111, the microblog image information items 112, the microblog text information items 113, the image classification model 114, the text classification model 115, the action taken/not-taken determination results 116, the positive/negative determination results 117, the classification-result information items 118, and so forth.

[0055] The microblog classification program 110 is a program that causes the controller 10 to operate as the above-described individual units 100 to 108.

[0056] It is supposed that the image classification model 114 has, as different information items, information items used by the action taken/not-taken determination unit 107 and information items used by the positive/negative determination unit 108.
Furthermore, it is supposed that, similarly, the text classification model 115 also has, as different information items, information items used by the action taken/not-taken determination unit 107 and information items used by the positive/negative determination unit 108. Note that the image classification model 114 is not limited to an image classification model generated by the image-classification-model generating unit 102. A model prepared in the memory 11 may be used as the image classification model 114, or a configuration in which a model prepared in an external unit is acquired as the image classification model 114 may be used.

[0057] Fig. 3 is a schematic diagram illustrating an example of a configuration of the microblog information items 111.

[0058] The microblog information items 111 have a user ID column, a microblog ID column, and a content column. In the user ID column, identifiers of users who posted information items that are referred to as posted information items are arranged. In the microblog ID column, for example, identifiers that are added in chronological order are arranged. In the content column, content items that are text items input as the posted information items, URLs of other servers in which images (still images or moving images) are stored and which are not illustrated, or the text items and the URLs are arranged. Note that, instead of the URLs arranged in the content column, information items concerning the still images or the moving images may be directly arranged in the content column.

[0059] Note that, although each of the content items includes a time information item indicating a time at which the content item was posted, here, the time information item is omitted and the content item is displayed.

[0060] Fig. 4 is a schematic diagram illustrating an example of a configuration of the microblog image information items 112.
[0061] The microblog image information items 112 have a user ID column, a microblog ID column, and an image content column. The user ID column and the microblog ID column are the same as the user ID column and the microblog ID column illustrated in Fig. 3. In the image content column, actual image information items stored in URLs which were input as posted information items are arranged.

[0062] In other words, the microblog image information items 112 are obtained by removing, from the microblog information items 111, posted information items including only text items, and by acquiring image information items from URLs in which images are stored.

[0063] Fig. 5 is a schematic diagram illustrating an example of a configuration of the microblog text information items 113.

[0064] The microblog text information items 113 have a user ID column, a microblog ID column, and a text content column. The user ID column and the microblog ID column are the same as the user ID column and the microblog ID column illustrated in Fig. 3. In the text content column, content items that are text items which were input as posted information items are arranged.

[0065] In other words, the microblog text information items 113 are obtained by removing, from the microblog information items 111, posted information items including only URLs in which images are stored, and by removing URLs from the remaining posted information items.

Operation of microblog classification system

[0066] Next, operations in the present exemplary embodiment are described separately as the following operations: (1) basic operation; (2) image-classification-model generating operation; (3) microblog-information acquiring operation; (4) action taken/not-taken determination operation; (5) positive/negative determination operation; and (6) classification result output operation.
(1) Basic operation

[0067] First, the user of the terminal 4 performs, on the terminal 4, an operation for transmitting an information item, which is to be referred to as a posted information item, to the microblog. Note that the following operation may be performed on the terminal 3.

[0068] The terminal 4 transmits, to the microblog server 2, in accordance with the operation performed by the user, an information item which includes a text item, an image, or the like and which is to be referred to as a posted information item, thereby posting the information item on the microblog.

[0069] The microblog server 2 receives the posted information item from the terminal 4, thereby accumulating the microblog information items.

[0070] Furthermore, the user of the terminal 4 performs, on the terminal 4, an operation for viewing the microblog.

[0071] The terminal 4 transmits, to the microblog server 2, in accordance with the operation performed by the user, a request to view the microblog information items.

[0072] The microblog server 2 transmits the microblog information items to the terminal 4.

[0073] When the terminal 4 receives the microblog information items from the microblog server 2, the terminal 4 displays, on the display, text items or images posted on the microblog.

[0074] Next, an operation for generating an image classification model will be described as an operation that is preparatory to classification of the microblog information items.

(2) Image-classification-model generating operation

[0075] Fig. 9 is a flowchart illustrating an example of an operation of the microblog classification server 1 in the case of generating the image classification model 114.

[0076] First, the image list display unit 100 acquires, among the microblog information items acquired by the microblog-information acquisition unit 104, all of the microblog information items or a predetermined number of microblog information items (S1).
The image list display unit 100 generates an information item for performing list display of image information items included in the microblog information items on the display of the terminal 3, and transmits the generated information item to the terminal 3 (S2).

[0077] The terminal 3 receives the information item, and displays an image list display screen on the display.

[0078] Fig. 6 is a schematic diagram illustrating an example of a configuration of an image list display screen that is displayed on the display of the terminal 3 by the image list display unit 100.

[0079] In an image list display screen 103a, list display of multiple images 1121, 1122, ... is performed. Note that the image list display screen 103a may be displayed in multiple pages.

[0080] Next, the user operates the operation unit of the terminal 3 with reference to the image list display screen 103a, thereby selecting certain images. The details of the operation are transmitted from the terminal 3 to the microblog classification server 1. Here, for example, it is supposed that the user selects images related to a keyword of "ABC fireworks display", i.e., images including fireworks, images including shop stands, images including people wearing yukatas that are Japanese garments, and so forth.

[0081] Next, the related-image selection unit 101 of the microblog classification server 1 accepts the details of the operation. In a state in which the information item generated by the image list display unit 100 is displayed on the display of the terminal 3, the related-image selection unit 101 selects, in accordance with the operation performed by the user of the terminal 3, some images from the images that are being displayed (S3). The selected images are in a state of being selected using a selection frame 103b as illustrated in Fig. 6.
[0082] Next, the image-classification-model generating unit 102 extracts feature values of the images selected by the related-image selection unit 101, and generates learning data items of the positive examples (S4).

[0083] Next, the image-classification-model generating unit 102 extracts feature values of the images that are not selected by the related-image selection unit 101, and generates learning data items of the negative examples (S5). Note that classification of the images is not limited to classification into two groups that are the positive example and the negative example, and may be classification into multiple groups.

[0084] Next, the image-classification-model generating unit 102 generates the image classification model 114 from the learning data items of the positive examples and the learning data items of the negative examples in relation to the search keyword ("ABC fireworks display") accepted by the search-keyword accepting unit 103 described below (S6), and stores the image classification model 114 in the memory 11 (S7).

[0085] Meanwhile, in order to make a request of the microblog classification server 1 to classify the microblog information items, the user of the terminal 3 operates the operation unit of the terminal 3, thereby inputting a search keyword. The terminal 3 transmits, together with the request to classify the microblog information items, the search keyword to the microblog classification server 1.

[0086] The microblog classification server 1 operates as follows in response to the request.

(3) Microblog-information acquiring operation

[0087] Fig. 10 is a flowchart illustrating an example of an operation of the microblog classification server 1 in the case of acquiring the microblog information items.

[0088] First, the search-keyword accepting unit 103 accepts the search keyword of "ABC fireworks display" from the terminal 3 (S10).
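Steps S4 to S6 of the image-classification-model generating operation can be illustrated with a minimal sketch. The specification does not state which feature values or which learning method are used, so everything here is an assumption made for illustration: the feature values are hypothetical two-element vectors, the model is a simple nearest-centroid classifier, and the names `generate_image_model` and `matches_positive` are not taken from the specification.

```python
import math


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def generate_image_model(features, selected_ids):
    """Build learning data from selected / unselected images and return a model.

    features:     dict mapping an image ID to its feature vector (assumed given)
    selected_ids: IDs of the images chosen by the user as related to the keyword
    """
    positives = [f for image_id, f in features.items() if image_id in selected_ids]      # S4
    negatives = [f for image_id, f in features.items() if image_id not in selected_ids]  # S5
    # S6: the "model" in this sketch is just one centroid per group.
    # Both groups must be non-empty, otherwise centroid() divides by zero.
    return {"positive": centroid(positives), "negative": centroid(negatives)}


def matches_positive(model, feature):
    """True when the feature vector lies closer to the positive centroid."""
    return math.dist(feature, model["positive"]) < math.dist(feature, model["negative"])


# Hypothetical feature values for four listed images; img1 and img2 were selected.
features = {
    "img1": [0.9, 0.1],
    "img2": [0.8, 0.2],
    "img3": [0.1, 0.9],
    "img4": [0.2, 0.7],
}
model = generate_image_model(features, selected_ids={"img1", "img2"})
```

A real system would likely substitute a trained image classifier for the centroid comparison; the point of the sketch is only that the user's selection on the image list display screen determines which images become positive examples and which become negative examples.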
[0089] Next, the microblog-information acquisition unit 104 retrieves posted information items on the basis of the search keyword accepted by the search-keyword accepting unit 103 from the microblog information items stored in the microblog server 2 (S11). Note that, in the case of retrieving posted information items, posted information items that include "ABC fireworks display" in its entirety may be retrieved. In addition, an abbreviation of the search keyword or multilingual versions of the keyword may be used, or posted information items may be retrieved using multiple keywords.

[0090] Next, the microblog-information acquisition unit 104 extracts user IDs of the posted information items which include the search keyword and which have been retrieved as search results (S12), and acquires, for each of the extracted user IDs, posted information items that were posted by the user having the user ID (S13). Note that all posted information items that were posted by the user having the user ID may be acquired, or, for each of the posted information items including the search keyword, posted information items that are within a predetermined chronological range with respect to the posted information item including the search keyword may be acquired.

[0091] Next, when link URLs indicating links to image information items are included in the posted information items that have been acquired, the microblog-information acquisition unit 104 acquires the image information items stored at the link URLs (S15).

[0092] Steps S13 to S15 described above are performed for all of the user IDs extracted in step S12.

[0093] Next, the microblog classification server 1 operates as follows so as to determine whether or not an action related to the search keyword was taken.

(4) Action taken/not-taken determination operation

[0094] Fig.
11 is a flowchart illustrating an example of an operation of the microblog classification server 1 in the case of determining whether or not a user who posted information items that are referred to as posted information items took an action.

[0095] First, the action taken/not-taken determination unit 107 acquires posted information items of a certain user ID (S20).

[0096] Next, the image-information classification unit 105 considers only image information items of the certain user ID as the microblog image information items 112 (S22). The action taken/not-taken determination unit 107 determines whether or not each of the images included in the microblog image information items 112 matches the positive example of the image classification model 114 (S23 and S24). In other words, the action taken/not-taken determination unit 107 classifies some of the microblog image information items 112 as image information items related to the search keyword. As a result of classification, for each of the classified microblog image information items 112, when the classified microblog image information item 112 matches the positive example (YES in S23), the action taken/not-taken determination unit 107 determines that the user having the user ID "took the action" related to the search keyword (S24). Note that, in the case where the search keyword is "ABC fireworks display", the phrase "took the action" indicates, for example, that a user "attended the ABC fireworks display". Here, it is supposed that "classification as image information items related to the search keyword" includes not only classification as image information items which match the image classification model 114 generated in "(2) image-classification-model generating operation", but also classification as image information items which match a model prepared in the memory 11 or a model prepared in an external unit.
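Taken together, the acquisition in steps S11 to S13 and the image-based determination in steps S22 to S24 can be sketched as follows. This is a minimal sketch under assumptions, not the embodiment's implementation: the `Post` layout, the function names, and the one-day range are hypothetical, each image is represented by a plain string label, and the predicate `is_related_image` stands in for matching against the positive example of the image classification model 114.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Post:
    """Hypothetical posted information item with its time information item."""
    user_id: str
    posted_at: datetime
    text: str = ""
    image_labels: list = field(default_factory=list)  # stand-in for image items


def acquire_posts(all_posts, keyword, window):
    """For each user with a keyword hit, collect that user's posts near a hit."""
    hits = [p for p in all_posts if keyword in p.text]            # S11
    acquired = {}
    for user_id in {h.user_id for h in hits}:                     # S12
        hit_times = [h.posted_at for h in hits if h.user_id == user_id]
        acquired[user_id] = [                                     # S13
            p for p in all_posts
            if p.user_id == user_id
            and any(abs(p.posted_at - t) <= window for t in hit_times)
        ]
    return acquired


def took_action_by_images(posts, is_related_image):
    """Return True if any image matches; None means the result is still unknown."""
    for post in posts:
        if any(is_related_image(image) for image in post.image_labels):
            return True
    return None


t0 = datetime(2012, 8, 1, 19, 0)
all_posts = [
    Post("u1", t0, "Heading to the ABC fireworks display"),
    Post("u1", t0 + timedelta(hours=2), "", ["fireworks"]),
    Post("u1", t0 + timedelta(days=30), "", ["fireworks"]),  # outside the range
    Post("u2", t0, "Wish I could see the ABC fireworks display"),
    Post("u3", t0, "Lunch", ["food"]),                       # no keyword hit
]
acquired = acquire_posts(all_posts, "ABC fireworks display", timedelta(days=1))
results = {
    user_id: took_action_by_images(posts, lambda image: image == "fireworks")
    for user_id, posts in acquired.items()
}
```

Restricting acquisition to a chronological range keeps the image posted a month later from counting as evidence, which is the practical effect of the "predetermined chronological range" option.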
[0097] Note that, in step S24, in the case where, for the same user ID, the number of images that match the positive example is equal to or larger than a predetermined threshold, the action taken/not-taken determination unit 107 may determine that the user having the user ID "took the action".

[0098] Next, when, in step S23, the classified microblog image information item 112 does not match the positive example (NO in S23), whether or not the user having the user ID took the action is still "unknown". Accordingly, next, classification based on text information items is performed. The text-information classification unit 106 considers, as the microblog text information items 113, only text information items included in the posted information items, which have been acquired by the action taken/not-taken determination unit 107, of the certain user ID (S26). The action taken/not-taken determination unit 107 determines whether or not each of the microblog text information items 113 matches the text classification model 115 (S27 and S28). When the microblog text information item 113 does not match the text classification model 115 (NO in S27), the action taken/not-taken determination unit 107 determines that the user having the user ID did "not take the action" (S29). Note that, in the case where the search keyword is "ABC fireworks display", the phrase "not take the action" indicates, for example, that a user did "not attend the ABC fireworks display".

[0099] Note that determination of whether or not the action was taken, which is described above, is performed for all user IDs (S30), and determination results are stored as the action taken/not-taken determination results 116 in the memory 11.

[0100] Next, the microblog classification server 1 operates as follows so as to determine whether the content of a posted information item is positive or negative.

(5) Positive/negative determination operation

[0101] Fig.
12 is a flowchart illustrating an example of an operation of the microblog classification server 1 in the case of determining whether the content of a posted information item is positive or negative.

[0102] First, the positive/negative determination unit 108 acquires posted information items of a certain user ID (S40).

[0103] The text-information classification unit 106 considers, as the microblog text information items 113, only text information items included in the posted information items, which have been acquired by the positive/negative determination unit 108, of the certain user ID (S41). The positive/negative determination unit 108 determines whether the content of each of the microblog text information items 113 is positive or negative (S42 and S43). Note that, in the case where the search keyword is "ABC fireworks display", the term "positive" indicates, for example, a favorable opinion about the "ABC fireworks display", and the term "negative" indicates, for example, an unfavorable opinion about the "ABC fireworks display".

[0104] Note that determination of whether the content of a text information item is positive or negative, which is described above, is performed for all user IDs (S44), and determination results are stored as the positive/negative determination results 117 in the memory 11.

(6) Classification-result output operation

[0105] Next, the classification-result output unit 109 generates the classification-result information items 118 from the action taken/not-taken determination results 116 and the positive/negative determination results 117, and transmits the classification-result information items 118 to the terminal 3.

[0106] Fig. 7 is a schematic diagram illustrating an example of a configuration of the classification-result information items 118.
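The two determinations in operations (4) and (5) can be sketched as two small functions. This is a hedged illustration only. It assumes that a match in step S27 yields "took the action" (the specification states the NO branch explicitly and only implies the YES branch), and the word lists below are hypothetical stand-ins for the text classification model 115. The "unknown" result for text with no clear polarity is likewise an assumption of this sketch.

```python
def determine_action(image_matches, text_matches, threshold=1):
    """Two-stage action taken/not-taken determination for one user ID.

    image_matches: one boolean per image (outcome of the check in S23)
    text_matches:  one boolean per text item (outcome of the check in S27)
    threshold:     minimum number of matching images (the optional threshold)
    """
    if sum(image_matches) >= threshold:
        return "took the action"            # S24
    # Image evidence is insufficient, so fall back to text items (S26, S27).
    if any(text_matches):
        return "took the action"            # assumed outcome of YES in S27
    return "did not take the action"        # S29


# Hypothetical word lists standing in for the text classification model 115.
POSITIVE_WORDS = {"beautiful", "great", "fun"}
NEGATIVE_WORDS = {"crowded", "boring", "terrible"}


def determine_polarity(text):
    """Classify one text item as positive or negative by counting cue words."""
    words = set(text.lower().split())
    score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"  # assumption: no clear polarity in this sketch


action = determine_action(image_matches=[False, False], text_matches=[True])
polarity = determine_polarity("The fireworks were beautiful")
```

Note the difference in granularity: `determine_action` is called once per user ID, whereas `determine_polarity` is called once per posted information item.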
[0107] The classification-result information items 118 have a user ID column, a microblog ID column, a content column, an image content column, an action taken/not-taken determination result column, and a positive/negative determination result column. The user ID column, the microblog ID column, and the content column are the user ID column, the microblog ID column, and the content column illustrated in Fig. 3, which are provided as common columns. The image content column is the image content column illustrated in Fig. 4, which is provided as a common column. In the action taken/not-taken determination result column, the determination results obtained by the action taken/not-taken determination unit 107 are arranged. In the positive/negative determination result column, the determination results obtained by the positive/negative determination unit 108 are arranged.

[0108] Note that determination of whether or not an action was taken is performed for each user ID, and determination of whether the content of a text information item is positive or negative is performed for each posted information item.

[0109] Furthermore, the terminal 3 may receive the classification-result information items 118, and may display the following information items on the display.

[0110] Fig. 8 is a schematic diagram illustrating an example of a configuration of a display screen that is output by the classification-result output unit 109 on the basis of the classification-result information items 118.

[0111] A classification-result-information display screen 108a includes content items 108b1 and an image 108c1, content items 108b2, content items 108b3 and an image 108c3, and content items 108b4. The content items 108b1 and the image 108c1 are posted information items, and each of the posted information items indicates, for the action, that a user "attended the fireworks display" and the content of the posted information item is "positive".
The content items 108b2 are posted information items, and each of the posted information items indicates, for the action, that a user did "not attend the fireworks display" and the content of the posted information item is "positive". The content items 108b3 and the image 108c3 are posted information items, and each of the posted information items indicates, for the action, that a user "attended the fireworks display" and the content of the posted information item is "negative". The content items 108b4 are posted information items, and each of the posted information items indicates, for the action, that a user did "not attend the fireworks display" and the content of the posted information item is "negative".

Advantages of exemplary embodiment

[0112] In the foregoing exemplary embodiment, each of the microblog image information items 112 is classified on the basis of the image classification model 114, and whether or not a user who posted an information item including the classified image information item took an action related to a search keyword is determined. Additionally, each of the microblog text information items 113 is classified on the basis of the text classification model 115, and whether or not a user who posted an information item including the classified text information item took the action related to the search keyword is determined. Accordingly, a posted information item may be classified by determining whether or not a user who posted the information item took the action.

[0113] Furthermore, each of the microblog text information items 113 included in posted information items is classified on the basis of the text classification model 115, and whether the content of a posted information item including the classified text information item is favorable or unfavorable is determined. Accordingly, a posted information item may be classified by determining whether the content of the posted information item is positive or negative.
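The four groups shown on the display screen of Fig. 8 correspond to the four combinations of the action taken/not-taken determination result and the positive/negative determination result. A minimal sketch of that grouping is given below. The row layout and the names `group_results`, `took_action`, and `polarity` are hypothetical and are not the column names of the classification-result information items 118.

```python
from collections import defaultdict


def group_results(rows):
    """Group microblog IDs by (action result, polarity result)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["took_action"], row["polarity"])].append(row["microblog_id"])
    return dict(groups)


# Hypothetical classification-result rows, one per posted information item.
rows = [
    {"user_id": "u1", "microblog_id": 1, "took_action": True, "polarity": "positive"},
    {"user_id": "u2", "microblog_id": 2, "took_action": False, "polarity": "positive"},
    {"user_id": "u3", "microblog_id": 3, "took_action": True, "polarity": "negative"},
    {"user_id": "u4", "microblog_id": 4, "took_action": False, "polarity": "negative"},
    {"user_id": "u1", "microblog_id": 5, "took_action": True, "polarity": "positive"},
]
groups = group_results(rows)
```

Here the key `(True, "positive")` plays the role of the group containing the content items 108b1 and the image 108c1, and the other three keys correspond to the remaining groups of Fig. 8.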
[0114] Moreover, classification results are displayed as in the classification-result-information display screen 108a illustrated in Fig. 8. Accordingly, for example, the attractive points of the "ABC fireworks display" may be extracted from the content items 108b1 and the image 108c1. For example, points for causing users to attend the "ABC fireworks display" may be extracted from the content items 108b1 and the image 108c1, and the content items 108b2. Improvement points of the "ABC fireworks display" may be extracted from the content items 108b3 and the image 108c3. Points for improving the impression of the "ABC fireworks display" may be extracted from the content items 108b4. Furthermore, a statistical information item concerning the classification-result information items 118 illustrated in Fig. 7 may be generated and output. Moreover, distribution of information items to individual users on the basis of the classification-result information items 118 may be performed. More specifically, for example, advertisements are distributed to users who attended the "ABC fireworks display" and users who did not attend the "ABC fireworks display" in such a manner that the content of the advertisement distributed to the users who attended the "ABC fireworks display" and the content of the advertisement distributed to the users who did not attend the "ABC fireworks display" are different from each other.

Other Exemplary Embodiments

[0115] Note that the present invention is not limited to the foregoing exemplary embodiment, and various modifications may be made without departing from the scope of the present invention.
For example, the microblog is not limited to Twitter (registered trademark), and any type of medium may be used if the medium is a medium on which comparatively short text items are posted, in which text information items and image information items (including still images, moving images, and link destination information items concerning links to information items concerning the still images or moving images) are mixed, and in which a large number of text information items and image information items are displayed in chronological order, such as Facebook (registered trademark). Furthermore, messages or the like of mail may be targets to be processed.

[0116] In the foregoing exemplary embodiment, the functions of the individual units 100 to 108 included in the controller 10 are realized by a program. However, all or some of the individual units may be realized by hardware such as an application-specific integrated circuit (ASIC). Furthermore, the program used in the foregoing exemplary embodiment may be stored on a recording medium, such as a compact disc read-only memory (CD-ROM), and supplied. Moreover, the steps described in the foregoing exemplary embodiment may be, for example, replaced, removed, or added without changing the scope of the present invention.

[0117] The foregoing description of the exemplary embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to understand the invention for various embodiments and with the various modifications as are suited to the particular use contemplated.
It is intended that the scope of the invention be defined by the following claims and their equivalents.

[0118] In the claims which follow and in the preceding description of the invention, except where the context requires otherwise due to express language or necessary implication, the word "comprise" or variations such as "comprises" or "comprising" is used in an inclusive sense, i.e. to specify the presence of the stated features but not to preclude the presence or addition of further features in various embodiments of the invention.
Claims:

1. An information classification program comprising: accepting a search keyword; retrieving, from a plurality of information items posted by a plurality of users, a posted information item including the accepted search keyword, each of the plurality of posted information items including at least either of a text information item and an image information item, and acquiring posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword; and classifying, as image information items related to the search keyword, some of image information items included in the posted information items that have been acquired, and performing first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.

2. The information classification program according to Claim 1, wherein, in the acquiring, a posted information item including the search keyword is retrieved from the plurality of posted information items, and, among other posted information items of a user who posted the information item including the search keyword, posted information items that are within a predetermined chronological range with respect to the posted information item including the search keyword are acquired.

3. The information classification program according to Claim 1 or 2, wherein, in the classifying and performing first determination, text information items included in the posted information items that have been acquired are classified, and, for each of the classified text information items, whether or not a user who posted an information item including the classified text information item took the action related to the search keyword is determined.

4. The information classification program according to any one of Claims 1 to 3, the process further comprising classifying text information items included in the posted information items that have been acquired, and performing second determination of, for each of the classified text information items, whether the content of a posted information item including the classified text information item is favorable or unfavorable.

5. The information classification program according to Claim 4, wherein, in the classifying and performing second determination, image information items included in the posted information items that have been acquired are classified, and, for each of the classified image information items, whether the content of a posted information item including the classified image information item is favorable or unfavorable is determined.

6. An information classification method comprising: accepting a search keyword; retrieving, from a plurality of information items posted by a plurality of users, a posted information item including the accepted search keyword, each of the plurality of posted information items including at least either of a text information item and an image information item, and acquiring posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword; and classifying, as image information items related to the search keyword, some of image information items included in the posted information items that have been acquired, and performing first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.

7. An information processing apparatus comprising: an accepting unit that accepts a search keyword; an acquisition unit that retrieves, from a plurality of information items posted by a plurality of users, a posted information item including the search keyword accepted by the accepting unit, each of the plurality of posted information items including at least either of a text information item and an image information item, and that acquires posted information items which are within a predetermined chronological range with respect to the posted information item including the search keyword; and a first determination unit that classifies, as image information items related to the search keyword, some of image information items included in the posted information items acquired by the acquisition unit, and that performs first determination of, for each of the classified image information items, whether or not a user who posted an information item including the classified image information item took an action related to the search keyword.

8. An information classification program substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.

9. An information classification method substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.

10. An information processing apparatus substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.
Family patents:
Publication No. | Publication date
US10185765B2 | 2019-01-22
JP2014052809A | 2014-03-20
JP5895777B2 | 2016-03-30
AU2013201006B2 | 2014-07-03
US20140067809A1 | 2014-03-06
Legal status:
2014-10-30 | FGA | Letters patent sealed or granted (standard patent)
2021-09-09 | HB | Alteration of name in register | Owner name: FUJIFILM BUSINESS INNOVATION CORP. Free format text: FORMER NAME(S): FUJI XEROX CO., LTD.
Priority:
Application No. | Filing date | Title
JP2012-196417||2012-09-06||
JP2012196417A|JP5895777B2|2012-09-06|2012-09-06|Information classification program and information processing apparatus|