Information classification program, information classification method, and information processing ap
Patent abstract:
Abstract

An information classification program (110) includes the following: acquiring multiple posted information items (111), each of the multiple posted information items (111) including at least either of a text information item (112) and an image information item (113); generating text information items (112) including multiple text items in such a manner that image information items (113) are removed from the multiple posted information items (111), and individually classifying the text items included in the text information items (112) into first categories; generating image information items (113) including multiple images in such a manner that text information items (112) are removed from the multiple posted information items (111), and individually classifying the images included in the image information items (113) into second categories; associating the classified text items (114) and the classified images (115) with each other on the basis of the first and second categories to obtain results (116); and outputting the text items and the images, which have been associated with each other, for each of the results.

[Flowchart: start; extract microblog text information items from microblog information items; classify microblog text information items into categories; extract microblog image information items from microblog information items; classify microblog image information items into categories; associate microblog text information items and microblog image information items with each other on basis of categories; output microblog-information classification results.]

Publication number: AU2013201018A1
Application number: U2013201018
Filing date: 2013-02-22
Publication date: 2014-02-06
Inventors: Motofumi Fukui; Keigo Hattori; Noriji Kato; Yasuhide Miura; Tomoko Okuma
Applicant: Fuji Xerox Co Ltd
Main IPC class: G06F17-21
Patent description:
INFORMATION CLASSIFICATION PROGRAM, INFORMATION CLASSIFICATION METHOD, AND INFORMATION PROCESSING APPARATUS

DESCRIPTION

Background

(i) Technical Field

[0001] The present invention relates to an information classification program, an information classification method, and an information processing apparatus.

(ii) Related Art

[0002] Any discussion of the prior art throughout the specification should in no way be considered as an admission that such prior art is widely known or forms part of common general knowledge in the field.

[0003] In the related art, an information processing apparatus has been proposed, which displays, for a user who has input a text item, an advertisement or the like corresponding to a term included in the input text item (for example, see Japanese Unexamined Patent Application Publication No. 2009-193133).

[0004] In a microblog in which multiple text information items and image information items that were posted are mixed and displayed in chronological order, such as Twitter (registered trademark), the information processing apparatus disclosed in Japanese Unexamined Patent Application Publication No. 2009-193133 acquires a text item input by a certain user, extracts a term from the acquired text item by using natural language processing, classifies the user on the basis of the extracted term, and shows the user an advertisement or the like corresponding to the term.

Summary

[0005] It is an object of the present invention to provide an information classification program, an information classification method, and an information processing apparatus that classify posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items.
[0006] In order to achieve the above-mentioned object, according to aspects of the present invention, there are provided an information classification program, an information classification method, and an information processing apparatus.

[0007] According to a first aspect, there is provided an information classification program. The information classification program includes the following: acquiring multiple posted information items, each of the multiple posted information items including at least either of a text information item and an image information item; generating text information items including multiple text items in such a manner that image information items are removed from the multiple posted information items, and individually classifying the text items included in the text information items into first categories; generating image information items including multiple images in such a manner that text information items are removed from the multiple posted information items, and individually classifying the images included in the image information items into second categories; associating the classified text items and the classified images with each other on the basis of the first and second categories to obtain results; and outputting the text items and the images, which have been associated with each other, for each of the results.

[0008] According to a second aspect, in the information classification program according to the first aspect, in the outputting, the text items and the images are output independently of each other for each of the results.
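The steps of the first and second aspects can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `PostedItem` type, the `classify_posts` function, and the two classifier callables are hypothetical names that do not appear in the specification, and the first and second categories are assumed to share category names so that equal names are treated as associated.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostedItem:
    """Hypothetical posted information item; at least one field is set."""
    item_id: int
    text: Optional[str] = None
    image: Optional[str] = None  # assumption: an image reference such as a file name


def classify_posts(posts, text_classifier, image_classifier):
    """Classify text and images separately, then associate them by category."""
    # Remove the image parts: keep only the text items, keyed by item ID.
    text_items = {p.item_id: p.text for p in posts if p.text is not None}
    # Remove the text parts: keep only the images, keyed by item ID.
    image_items = {p.item_id: p.image for p in posts if p.image is not None}

    # Classify each kind of item individually (first and second categories).
    text_categories = {i: text_classifier(t) for i, t in text_items.items()}
    image_categories = {i: image_classifier(m) for i, m in image_items.items()}

    # Associate by category name (assumption: both classifiers use the same
    # names). Texts and images stay in separate lists so that they can be
    # output independently of each other for each result.
    results = defaultdict(lambda: {"texts": [], "images": []})
    for i, category in text_categories.items():
        results[category]["texts"].append(text_items[i])
    for i, category in image_categories.items():
        results[category]["images"].append(image_items[i])
    return dict(results)
```

Because each posted item is handled per part, an item with only text or only an image passes through the same code path as an item with both, which is one way to accommodate any ratio of text to images.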
[0009] According to a third aspect, in the information classification program according to the first or second aspect, in the associating, association information items in which the first categories and the second categories are associated with each other using overall categories are provided, and the classified text items and the classified images are associated with each other on the basis of the first categories, the second categories, and the overall categories included in the association information items.

[0010] According to a fourth aspect, there is provided an information classification method. The information classification method includes the following: acquiring multiple posted information items, each of the multiple posted information items including at least either of a text information item and an image information item; generating text information items including multiple text items in such a manner that image information items are removed from the multiple posted information items, and individually classifying the text items included in the text information items into first categories; generating image information items including multiple images in such a manner that text information items are removed from the multiple posted information items, and individually classifying the images included in the image information items into second categories; associating the classified text items and the classified images with each other on the basis of the first and second categories to obtain results; and outputting the text items and the images, which have been associated with each other, for each of the results.

[0011] According to a fifth aspect, there is provided an information processing apparatus. The information processing apparatus includes an acquisition unit, a text classification unit, an image classification unit, an associating unit, and an output unit. The acquisition unit acquires multiple posted information items.
Each of the multiple posted information items includes at least either of a text information item and an image information item. The text classification unit generates text information items including multiple text items in such a manner that image information items are removed from the multiple posted information items, and individually classifies the text items included in the text information items into first categories. The image classification unit generates image information items including multiple images in such a manner that text information items are removed from the multiple posted information items, and individually classifies the images included in the image information items into second categories. The associating unit associates the classified text items and the classified images with each other on the basis of the first and second categories to obtain results. The output unit outputs the text items and the images, which have been associated with each other by the associating unit, for each of the results.

[0012] According to each of the first, fourth, and fifth aspects of the present invention, the posted information items including text information items and image information items, each of the posted information items having any ratio between the number of text information items and the number of image information items, can be classified.

[0013] According to the second aspect of the present invention, the text items and the images can be output independently of each other for each of the results.

[0014] According to the third aspect of the present invention, the first categories and the second categories are associated with each other using the overall categories, and the text items classified into the first categories and the images classified into the second categories can be associated with each other.
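The association through overall categories described for the third aspect might look like the following sketch. The `ASSOCIATIONS` table, its category names, and the `associate` function are hypothetical; the specification does not give the layout of the association information items in this section, so the dictionary shape here is an assumption.

```python
from collections import defaultdict

# Hypothetical association information: each overall category lists the
# first (text) categories and the second (image) categories it covers.
ASSOCIATIONS = {
    "fireworks": {"text": {"fireworks talk"}, "image": {"night sky", "fireworks photo"}},
    "food": {"text": {"food stall talk"}, "image": {"food photo"}},
}


def associate(classified_texts, classified_images, associations):
    """Group (item, category) pairs under the overall category covering them."""
    # Invert the table: first/second category name -> overall category name.
    text_to_overall = {c: o for o, a in associations.items() for c in a["text"]}
    image_to_overall = {c: o for o, a in associations.items() for c in a["image"]}

    results = defaultdict(lambda: {"texts": [], "images": []})
    for text, category in classified_texts:
        if category in text_to_overall:
            results[text_to_overall[category]]["texts"].append(text)
    for image, category in classified_images:
        if category in image_to_overall:
            results[image_to_overall[category]]["images"].append(image)
    return dict(results)
```

In this sketch, items whose category is not listed in the association table are simply left out of the results. That is a design choice of the example, not something the specification states.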
Brief Description of the Drawings

[0015] Exemplary embodiments of the present invention will be described in detail based on the following figures, wherein:

[0016] Fig. 1 is a block diagram illustrating an example of a configuration of a microblog classification system according to a first exemplary embodiment of the present invention;

[0017] Fig. 2 is a block diagram illustrating an example of a configuration of a microblog classification server;

[0018] Fig. 3 is a schematic diagram illustrating an example of a configuration of microblog information items;

[0019] Fig. 4 is a schematic diagram illustrating an example of a configuration of microblog text information items;

[0020] Fig. 5 is a schematic diagram illustrating an example of a configuration of microblog image information items;

[0021] Fig. 6 is a schematic diagram illustrating an example of a configuration of microblog text-information classification results;

[0022] Fig. 7 is a schematic diagram illustrating an example of a configuration of microblog image-information classification results;

[0023] Fig. 8 is a schematic diagram illustrating an example of a configuration of microblog information classification results;

[0024] Fig. 9 is a flowchart illustrating an example of an operation of the microblog classification system;

[0025] Fig. 10 is a schematic diagram illustrating an example of a display screen that is obtained by performing a display process on a webpage information item generated on the basis of the microblog-information classification results;

[0026] Fig. 11 is a block diagram illustrating an example of a configuration of a microblog classification server according to a second exemplary embodiment; and

[0027] Fig. 12 is a schematic diagram illustrating an example of a configuration of category association information items.

Detailed Description

First Exemplary Embodiment

Configuration of microblog classification system

[0028] Fig. 1 is a schematic diagram illustrating an example of a configuration of a microblog classification system according to a first exemplary embodiment of the present invention.

[0029] A microblog classification system 6 includes a microblog classification server 1, a microblog server 2, a web server 3, and a terminal 4, and connects, using a network 5, the individual apparatuses so that the apparatuses are able to communicate with each other. Here, a microblog is a medium in which multiple text information items and image information items that were posted (transmitted) are mixed and displayed in chronological order. More specifically, microblog information items stored in the microblog server 2 are subjected to a display process by an information processing apparatus such as the terminal 4, whereby the microblog is displayed. Hereinafter, the unit of an information item posted on the microblog is referred to, for simplicity, as a "posted information item". It is supposed that a posted information item includes a text information item and an image information item, includes only a text information item, or includes only an image information item. In other words, each of the microblog information items includes multiple posted information items.

[0030] The microblog classification server 1 is an information processing apparatus that includes electronic components such as a central processing unit (CPU) having functions for processing information items and a memory. The microblog classification server 1 acquires the microblog information items from the microblog server 2, and classifies multiple text information items and image information items that are included in posted information items into individual categories.

[0031] The microblog server 2 is an information processing apparatus that includes electronic components such as a CPU having functions for processing information items and a memory.
The microblog server 2 accepts text information items such as text items and/or image information items concerning still images such as photographs or moving images, which have been transmitted from the terminal 4 or the like and which are to be referred to as posted information items, and generates microblog information items for displaying the posted information items in chronological order. Moreover, when the microblog server 2 accepts, from the terminal 4, a request to view the microblog information items, the microblog server 2 transmits the microblog information items to the terminal 4. Note that it is supposed that an image information item included in a posted information item directly includes an information item concerning a still image or a moving image or includes a link destination information item concerning a link destination in which an information item concerning a still image or a moving image is stored. Furthermore, a text information item included in a posted information item may directly include an information item concerning a text item or may include a link destination information item concerning a link destination in which an information item concerning a text file, a hypertext markup language (HTML) file, or the like is stored.

[0032] The web server 3 is an information processing apparatus that includes electronic components such as a CPU having functions for processing information items and a memory. The web server 3 stores, in the memory, webpage information items for displaying webpages such as HTML files. When the web server 3 receives a request to view a webpage from the terminal 4, the web server 3 transmits a webpage information item to the terminal 4. Note that a webpage information item is generated on the basis of classification result information items generated by the microblog classification server 1 as described below.
[0033] The terminal 4 includes an operation unit such as a touch panel used to input an instruction for an operation, a display such as a liquid crystal display provided under the touch panel, and a controller having electronic components such as a CPU and a memory. The terminal 4 transmits an information item, which is to be referred to as a posted information item, such as a text item or an image, to the microblog server 2 in accordance with an operation performed by a user, thereby posting the information item on the microblog. Moreover, the terminal 4 transmits, to the microblog server 2, in accordance with an operation performed by the user, a request to view the microblog. When the terminal 4 receives the microblog information items from the microblog server 2 as a result of the request to view the microblog, the terminal 4 displays, on the display, text items or images (still images or moving images) included in posted information items of the microblog.

[0034] Furthermore, the terminal 4 transmits, in accordance with an operation performed by the user, to the web server 3, a request to view a webpage. As a result of the request to view a webpage, when the terminal 4 receives a webpage information item corresponding to the webpage from the web server 3, the terminal 4 performs a display process on the webpage information item, and displays the webpage on the display.

[0035] Note that the terminal 4 is, for example, a mobile phone. Alternatively, a personal digital assistant (PDA), a personal computer, or the like may be used as the terminal 4. Furthermore, although one terminal 4 is illustrated in Fig. 1, the number of terminals 4 may be two or more.

[0036] The network 5 is a communication network such as the Internet or a local area network (LAN), regardless of whether the network 5 is a wired network or wireless network.

Configuration of microblog classification server

[0037] Fig. 2 is a block diagram illustrating an example of a configuration of the microblog classification server 1.

[0038] The microblog classification server 1 includes a controller 10, a memory 11, and a communication section 12. The controller 10 is constituted by a CPU or the like, and controls individual units and executes various types of programs. The memory 11 is provided as an example of a storage device that is constituted by a recording medium such as a hard disk drive (HDD) or a flash memory and that stores information items. The communication section 12 communicates with an external apparatus via the network 5.

[0039] The controller 10 executes a microblog classification program 110, which is described below, thereby functioning as a microblog-information acquisition unit 100, a text-information classification unit 101, an image-information classification unit 102, a category associating unit 103, a classification-result output unit 104, and so forth.

[0040] The microblog-information acquisition unit 100 acquires microblog information items 111 from the microblog server 2, and stores the microblog information items 111 in the memory 11. Note that all of the microblog information items accumulated in the microblog server 2 may be acquired as the microblog information items 111, or microblog information items that have been obtained by filtering using a predetermined time period, a predetermined keyword, or the like may be acquired as the microblog information items 111. A method for acquiring the microblog information items 111 will be described below.

[0041] The text-information classification unit 101 considers information items that have been obtained by removing image information items from individual posted information items included in the microblog information items 111 acquired by the microblog-information acquisition unit 100, i.e., only text information items, as microblog text information items 112.
Further, the text-information classification unit 101 classifies the individual text information items included in the microblog text information items 112 into categories to obtain results, and stores the results as microblog-text-information classification results 114 in the memory 11.

[0042] Note that, as an example of a classification method, the text-information classification unit 101 determines whether or not each of multiple words is present in text items that belong to the individual categories, and generates a feature vector having the presence/absence of the word as an element, thereby performing learning in advance. The text-information classification unit 101 generates a feature vector similarly for a text information item that is a classification target, and compares the generated feature vector with the feature vector obtained as a result of learning. Accordingly, the text information item that is a classification target is classified by determining whether the text information item belongs to any one of the categories.

[0043] The image-information classification unit 102 considers information items that have been obtained by removing text information items from individual posted information items included in the microblog information items 111 acquired by the microblog-information acquisition unit 100, i.e., only image information items, as microblog image information items 113. Further, the image-information classification unit 102 classifies individual images included in the microblog image information items 113 into categories to obtain results, and stores the results as the microblog-image-information classification results 115 in the memory 11.

[0044] Note that, as an example of a classification method, the image-information classification unit 102 generates feature information items of images related to the individual categories, thereby performing learning in advance.
The image-information classification unit 102 similarly generates a feature information item of an image information item that is a classification target, and compares the generated feature information item with the feature information item obtained as a result of learning. Accordingly, the image information item that is a classification target is classified by determining whether the image information item belongs to any one of the categories.

[0045] The category associating unit 103 associates the microblog-text-information classification results 114 and the microblog-image-information classification results 115 with each other on the basis of the categories into which the microblog-text-information classification results 114 have been classified by the text-information classification unit 101 and the categories into which the microblog-image-information classification results 115 have been classified by the image-information classification unit 102, and stores the classification results as microblog-information classification results 116 in the memory 11.

[0046] The classification-result output unit 104 outputs the microblog-information classification results 116 to an external apparatus, e.g., the web server 3.

[0047] The memory 11 stores the microblog classification program 110, the microblog information items 111, the microblog text information items 112, the microblog image information items 113, the microblog-text-information classification results 114, the microblog-image-information classification results 115, the microblog-information classification results 116, and so forth.

[0048] The microblog classification program 110 is a program that causes the controller 10 to operate as the above-described individual units 100 to 104.

[0049] Fig. 3 is a schematic diagram illustrating an example of a configuration of the microblog information items 111.

[0050] The microblog information items 111 have a microblog ID column and a content column.
In the microblog ID column, for example, identifiers that are added in chronological order are arranged. The content column holds the content items: text items input as posted information items, URLs of other servers (not illustrated) in which images (still images or moving images) are stored, or both the text items and the URLs. Note that, instead of the URLs arranged in the content column, information items concerning the still images or the moving images may be directly arranged in the content column.

[0051] Note that, although each of the content items is an information item posted by a user having a user ID and includes a time information item indicating a time at which the content item was posted, here, the user ID and the time information item are omitted and the content item is displayed.

[0052] Fig. 4 is a schematic diagram illustrating an example of a configuration of the microblog text information items 112.

[0053] The microblog text information items 112 have a microblog ID column and a text content column. The microblog ID column is the microblog ID column illustrated in Fig. 3, which is provided as a common column. In the text content column, content items that are text items which were input as posted information items are arranged.

[0054] In other words, the microblog text information items 112 are obtained by removing, from the microblog information items 111, posted information items including only URLs in which images are stored, and by removing, from posted information items including URLs, the URLs.

[0055] Fig. 5 is a schematic diagram illustrating an example of a configuration of the microblog image information items 113.

[0056] The microblog image information items 113 have a microblog ID column and an image content column. The microblog ID column is the microblog ID column illustrated in Fig. 3, which is provided as a common column.
In the image content column, actual image information items stored in URLs which were input as posted information items are arranged.

[0057] In other words, the microblog image information items 113 are obtained by removing, from the microblog information items 111, posted information items including only text items, and by acquiring image information items from URLs in which images are stored.

[0058] Fig. 6 is a schematic diagram illustrating an example of a configuration of the microblog-text-information classification results 114.

[0059] The microblog-text-information classification results 114 have a microblog ID column, a text content column, and a category column. The microblog ID column is the microblog ID column illustrated in Fig. 3, which is provided as a common column. The text content column is the text content column illustrated in Fig. 4, which is provided as a common column. In the category column, category names that are obtained as results of classification of text items arranged in the text content column are arranged.

[0060] Fig. 7 is a schematic diagram illustrating an example of a configuration of the microblog-image-information classification results 115.

[0061] The microblog-image-information classification results 115 have a microblog ID column, an image content column, and a category column. The microblog ID column is the microblog ID column illustrated in Fig. 3, which is provided as a common column. The image content column is the image content column illustrated in Fig. 5, which is provided as a common column. In the category column, category names that are obtained as results of classification of image information items arranged in the image content column are arranged.

[0062] Fig. 8 is a schematic diagram illustrating an example of a configuration of the microblog-information classification results 116.
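As an illustration of how rows shaped like Fig. 3 might become rows shaped like Figs. 4 to 6, the sketch below first separates text from image URLs and then applies a word presence/absence feature vector of the kind described in paragraph [0042]. Several points are assumptions rather than statements from the specification: images are taken to appear as `http(s)` URLs, fetching the actual image data is omitted, the vocabulary is supplied by hand, and the comparison rule (counting shared present words) is a stand-in because the specification does not say how the vectors are compared. All function names are hypothetical.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")  # assumption: images appear as http(s) URLs


def split_content(rows):
    """Split (id, content) rows into text rows and image-URL rows."""
    text_rows, image_rows = [], []
    for item_id, content in rows:
        urls = URL_PATTERN.findall(content)
        text = URL_PATTERN.sub("", content).strip()
        if text:  # items that held only URLs contribute no text row
            text_rows.append((item_id, text))
        image_rows.extend((item_id, url) for url in urls)
    return text_rows, image_rows


def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def learn(labelled_texts, vocabulary):
    """Per category, a binary vector: 1 if the word occurs in any training text."""
    words_by_category = {}
    for text, category in labelled_texts:
        words_by_category.setdefault(category, set()).update(tokenize(text))
    return {
        category: [int(word in words) for word in vocabulary]
        for category, words in words_by_category.items()
    }


def classify(text, learned, vocabulary):
    """Pick the category whose learned vector shares the most present words."""
    tokens = tokenize(text)
    vector = [int(word in tokens) for word in vocabulary]

    def overlap(category):
        return sum(a & b for a, b in zip(vector, learned[category]))

    return max(learned, key=overlap)
```

Pairing each text row with the output of `classify` gives a table with the ID, text content, and category columns of Fig. 6. An analogous step over image features would give the Fig. 7 table.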
[0063] The microblog-information classification results 116 have a microblog ID column, an image content column, a text content column, and a category column. The microblog ID column is the microblog ID column illustrated in Fig. 3, which is provided as a common column. The image content column is the image content column illustrated in Fig. 5, which is provided as a common column. The text content column is the text content column illustrated in Fig. 4, which is provided as a common column. The category column is the category column illustrated in Figs. 6 and 7, which is provided as a common column.

Operation of microblog classification system

[0064] Next, operations in the present exemplary embodiment are described separately as the following operations: (1) basic operation; (2) microblog-information acquiring operation; (3) microblog classification operation; and (4) classification-result output operation.

(1) Basic operation

[0065] First, the user of the terminal 4 performs, on the terminal 4, an operation for transmitting an information item, which is to be referred to as a posted information item, to the microblog.

[0066] The terminal 4 transmits, to the microblog server 2, in accordance with the operation performed by the user, an information item which includes a text item, an image, or the like and which is to be referred to as a posted information item, thereby posting the information item on the microblog.

[0067] The microblog server 2 receives the posted information item from the terminal 4, thereby accumulating the microblog information items.

[0068] Furthermore, the user of the terminal 4 performs, on the terminal 4, an operation for viewing the microblog.

[0069] The terminal 4 transmits, to the microblog server 2, in accordance with the operation performed by the user, a request to view the microblog information items.

[0070] The microblog server 2 transmits the microblog information items to the terminal 4.
[0071] When the terminal 4 receives the microblog information items from the microblog server 2, the terminal 4 displays, on the display, text items or images posted on the microblog.

[0072] Meanwhile, the administrator of the web server 3 makes a request to the microblog classification server 1 for information items that are to be used as materials for generating a webpage which is to be placed on the web server 3. Note that the web server 3 may regularly make the request to the microblog classification server 1.

[0073] The microblog classification server 1 operates as follows in response to the request.

[0074] Fig. 9 is a flowchart illustrating an example of an operation of the microblog classification system.

(2) Microblog-information acquiring operation

[0075] First, the microblog-information acquisition unit 100 acquires, from the microblog server 2, the microblog information items 111 that are targets (S1).

[0076] The microblog-information acquisition unit 100 may acquire all of the microblog information items stored in the microblog server 2 as the microblog information items 111. However, the microblog-information acquisition unit 100 may acquire some of the microblog information items, whereby the processing load for processes that are described below may be reduced. A method for acquiring some of the microblog information items is, for example, as follows.

[0077] First, the microblog-information acquisition unit 100 specifies multiple keywords related to the content of the webpage that is to be placed on the web server 3.

[0078] Next, the microblog-information acquisition unit 100 retrieves a posted information item including one keyword selected from among the multiple keywords.
[0079] Next, the microblog-information acquisition unit 100 identifies a user who posted the information item that is the posted information item which has been retrieved, and retrieves a posted information item including another one of the specified keywords from posted information items that were posted in a predetermined time period among posted information items of the identified user. Next, a series of posted information items starting with the posted information item including the selected one keyword and ending with the posted information item including the other keyword is acquired.

[0080] By acquiring some of the microblog information items stored in the microblog server 2 as described above, a series of posted information items transmitted by a certain user for a certain keyword may be acquired.

(3) Microblog classification operation

[0081] Next, the text-information classification unit 101 considers information items that have been obtained by removing image information items from posted information items included in the microblog information items 111 illustrated in Fig. 3, i.e., only text information items, as the microblog text information items 112 illustrated in Fig. 4 (S2). In the example illustrated in Fig. 3, URLs that are destinations in which image information items are stored are removed.

[0082] Furthermore, the text-information classification unit 101 classifies the individual text information items included in the microblog text information items 112 illustrated in Fig. 4 into the categories to obtain results, and stores the results as the microblog-text-information classification results 114 in the memory 11 as illustrated in Fig. 6 (S3).

[0083] Meanwhile, the image-information classification unit 102 considers information items that have been obtained by removing text information items from the microblog information items 111 which have been acquired by the microblog-information acquisition unit 100 and which are illustrated in Fig. 3, i.e., only image information items, as the microblog image information items 113 illustrated in Fig. 5 (S4). In the example illustrated in Fig. 3, text information items are removed, and image information items are acquired from URLs in which the image information items are stored.

[0084] Moreover, the image-information classification unit 102 classifies individual images included in the microblog image information items 113 illustrated in Fig. 5 into the categories to obtain results, and stores the results as the microblog-image-information classification results 115 in the memory 11 as illustrated in Fig. 7.

[0085] Next, the category associating unit 103 stores the microblog-text-information classification results 114, which are illustrated in Fig. 6, and the microblog-image-information classification results 115, which are illustrated in Fig. 7, as the microblog-information classification results 116, which are illustrated in Fig. 8, in the memory 11 so that the microblog-text-information classification results 114 and the microblog-image-information classification results 115 are associated with each other on the basis of categories arranged in the category column.

(4) Classification-result output operation

[0086] Next, the classification-result output unit 104 outputs the microblog-information classification results 116 to the web server 3 (S7).

[0087] The web server 3 generates the webpage on the basis of the microblog-information classification results 116.

[0088] Note that "(2) microblog-information acquiring operation" and "(3) microblog classification operation" may be performed at predetermined time intervals or every time the microblog information items stored in the microblog server 2 are updated. In such a case, the microblog-information classification results 116 are successively transmitted from the microblog classification server 1 to the web server 3.
Furthermore, it is supposed that the web server 3 updates the contents of the webpage every time the microblog-information classification results 116 are received.
[0089] Meanwhile, the terminal 4 transmits, to the web server 3, a request to view the webpage. In response to the request to view the webpage, the web server 3 transmits a webpage information item corresponding to the webpage to the terminal 4.
[0090] When the terminal 4 receives the webpage information item from the web server 3, the terminal 4 performs a display process on the webpage information item, and displays the webpage on the display.
[0091] Fig. 10 is a schematic diagram illustrating an example of a display screen that is obtained by performing a display process on the webpage information item generated on the basis of the microblog-information classification results 116.
[0092] A webpage display screen 40 is, for example, a display screen showing a fireworks display that was held in the past. The webpage display screen 40 includes a title 400, image display regions 401a, 402a, ..., and text display regions 401b, 402b, .... In the image display regions 401a, 402a, ..., image information items are displayed for each of the categories into which the image information items have been classified by the image-information classification unit 102. In the text display regions 401b, 402b, ..., text information items are displayed for each of the categories into which the text information items have been classified by the text-information classification unit 101.
[0093] Note that, although the web server 3 updates the contents of the webpage information item every time the web server 3 receives the microblog-information classification results 116, the image display regions 401a, 402a, ... and the text display regions 401b, 402b, ... may be updated independently of each other.
The reason for this is that, in a microblog as typically operated at present, the number of image information items is in most cases smaller than the number of text information items. In other words, the number of image information items (for example, the number of still images) for each of the categories into which the image information items have been classified and the number of text information items (the number of posted information items input as text items) for each of the categories into which the text information items have been classified differ from each other. If the image display regions 401a, 402a, ... and the text display regions 401b, 402b, ... are updated simultaneously, the area of the text display regions 401b, 402b, ... needed to display the text information items becomes larger than that of the image display regions 401a, 402a, ... needed to display the image information items. Consequently, the balance therebetween is lost.
[0094] Accordingly, the time intervals at which display of image information items is updated and the time intervals at which display of text information items is updated may be different from each other. In this case, the time intervals at which display of image information items is updated are set to be longer than the time intervals at which display of text information items is updated, whereby the above-described issue may be addressed. Alternatively, the time intervals at which display of image information items is updated may be set to be shorter than the time intervals at which display of text information items is updated.
[0095] For example, it is supposed that 100 images are classified as image information items related to a category of "fireworks", and 1000 text information items are classified as text information items related to the category of "fireworks".
In this case, the time intervals at which display of image information items is updated are set to be longer than the time intervals at which display of text information items is updated (for example, display of image information items is updated at time intervals that are 10 times the time intervals at which display of text information items is updated), whereby, in the case where the image display regions 401a, 402a, ... and the text display regions 401b, 402b, ... are displayed, the balance therebetween may be maintained.
[0096] Additionally, the types of webpages placed on the web server 3 are not particularly limited. In addition to a report of an event held in the past as illustrated in Fig. 10, examples of the types of webpages placed on the web server 3 include a complaint report that is viewed by members of the executive committee of an event and a summary website related to a certain keyword.
Advantages of exemplary embodiment
[0097] In the foregoing first exemplary embodiment, the microblog information items 111 are divided into the microblog text information items 112 and the microblog image information items 113. Classification of the microblog text information items 112 into categories and classification of the microblog image information items 113 into categories are performed independently of each other. The microblog text information items 112 and the microblog image information items 113 are associated with each other for each of the categories. Accordingly, posted information items in which multiple text information items and image information items are mixed and which are displayed in chronological order may be classified.
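The split, classify, and associate flow summarized in [0097] can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names, the URL pattern used to separate images from text, the keyword-based text classifier, and the lookup table that stands in for image analysis are hypothetical and are not taken from the specification.

```python
import re
from collections import defaultdict

# Assumption: image information items appear in a post as http(s) URLs.
URL_PATTERN = re.compile(r"https?://\S+")


def split_post(post):
    """Separate one posted item into its text part and its image URLs."""
    urls = URL_PATTERN.findall(post)
    text = URL_PATTERN.sub("", post).strip()
    return text, urls


def classify_text(text):
    """Placeholder text classifier (hypothetical keyword rules)."""
    lowered = text.lower()
    if "firework" in lowered:
        return "fireworks"
    if "line" in lowered or "queue" in lowered:
        return "wait in line"
    return "other"


def classify_image(url, image_labels):
    """Placeholder image classifier: a lookup standing in for image analysis."""
    return image_labels.get(url, "other")


def classify_posts(posts, image_labels):
    """Classify texts and images independently, then group both by category."""
    results = defaultdict(lambda: {"texts": [], "images": []})
    for post in posts:
        text, urls = split_post(post)
        if text:
            results[classify_text(text)]["texts"].append(text)
        for url in urls:
            results[classify_image(url, image_labels)]["images"].append(url)
    return dict(results)


posts = [
    "Great fireworks tonight! http://example.com/a.jpg",
    "Stuck in a long line",
    "http://example.com/b.jpg",
]
image_labels = {
    "http://example.com/a.jpg": "wait in line",
    "http://example.com/b.jpg": "fireworks",
}
result = classify_posts(posts, image_labels)
```

In this sketch the text and the image of the first post end up under different categories, because each part is classified on its own content rather than on the post it came from.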
[0098] Furthermore, in the case where a text item and a URL of an image are included in one posted information item, if the posted information item is displayed as is as a classification result, the category into which the text item has been classified and the category into which the image has been classified do not necessarily match. In the first exemplary embodiment, by contrast, text items and images are classified independently and then associated for each category, so the accuracy with which the content of a text information item and the content of an image information item match may be improved, compared with the case where the posted information item including the text item and the URL of the image is displayed as a classification result.
Second Exemplary Embodiment
[0099] Fig. 11 is a block diagram illustrating an example of a configuration of a microblog classification server according to a second exemplary embodiment. The same components as in the first exemplary embodiment are denoted by the same reference numerals.
[0100] A microblog classification server 1A according to the second exemplary embodiment is obtained by adding category association information items 117 to the configuration of the microblog classification server 1 according to the first exemplary embodiment. Furthermore, the operation of the category associating unit 103 in the second exemplary embodiment is different from that of the category associating unit 103 in the first exemplary embodiment.
[0101] Fig. 12 is a schematic diagram illustrating an example of a configuration of the category association information items 117.
[0102] The category association information items 117 include an overall category column, a text information category column, and an image information category column. In the overall category column, categories used to associate the categories for text information items and the categories for image information items with each other are arranged.
In the text information category column, the categories into which the microblog text information items 112 are classified are arranged. In the image information category column, the categories into which the microblog image information items 113 are classified are arranged.
Operations in second exemplary embodiment
[0103] Operations in the second exemplary embodiment are similar to the operations in the first exemplary embodiment except for the operation described below. Accordingly, a description of the similar operations is omitted.
[0104] The category associating unit 103 associates the microblog-text-information classification results 114, which are illustrated in Fig. 6, and the microblog-image-information classification results 115, which are illustrated in Fig. 7, with each other on the basis of the categories arranged in the category columns and the category association information items 117, and stores the classification results as the microblog-information classification results 116 in the memory 11.
[0105] For example, image information items are classified using image analysis. Accordingly, image information items are not classified into subjective categories such as a category of "obstacles" and a category of "disappointed", but are classified into objective categories such as a category of "wait in line". In the first exemplary embodiment, image information items are not directly associated with text information items classified into categories such as the category of "obstacles" and the category of "disappointed". However, in the second exemplary embodiment, in the case where a category of "negative" is set as an overall category, image information items classified into the category of "wait in line" and text information items classified into categories such as the category of "obstacles" and the category of "disappointed" may be associated with each other as information items classified into the same category.
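The association through overall categories described in [0102] to [0105] can be sketched as a small lookup. This is only an illustration under stated assumptions: the table layout, the function names, and the sample items are hypothetical, and falling back to the original category when no overall category is registered is an assumption chosen to mimic the exact matching of the first exemplary embodiment.

```python
from collections import defaultdict

# Hypothetical stand-in for the category association information items 117:
# overall category -> member categories for text items and for images.
CATEGORY_ASSOCIATION = {
    "negative": {
        "text": {"obstacles", "disappointed"},
        "image": {"wait in line"},
    },
}


def overall_category(category, kind, association):
    """Map a text or image category to its overall category.

    Assumption: a category with no registered overall category keeps its
    own name, which reduces to exact category matching.
    """
    for overall, members in association.items():
        if category in members[kind]:
            return overall
    return category


def associate(text_results, image_results, association):
    """Merge per-category text and image results under overall categories."""
    merged = defaultdict(lambda: {"texts": [], "images": []})
    for category, texts in text_results.items():
        key = overall_category(category, "text", association)
        merged[key]["texts"].extend(texts)
    for category, images in image_results.items():
        key = overall_category(category, "image", association)
        merged[key]["images"].extend(images)
    return dict(merged)


text_results = {
    "obstacles": ["Could not see past the crowd"],
    "disappointed": ["The show was too short"],
    "fireworks": ["Beautiful finale"],
}
image_results = {
    "wait in line": ["img1.jpg"],
    "fireworks": ["img2.jpg"],
}
merged = associate(text_results, image_results, CATEGORY_ASSOCIATION)
```

Here the "wait in line" image is grouped with the "obstacles" and "disappointed" texts under "negative", while "fireworks" items are still matched by their shared category name.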
Advantages of second exemplary embodiment
[0106] In the foregoing second exemplary embodiment, when the category associating unit 103 associates text information items and image information items, the text information items and the image information items are associated with each other on the basis of the category association information items 117 rather than on exact matching of categories. Thus, information items posted on the microblog may be classified into categories that are conceptually broader than those in the first exemplary embodiment.
[0107] Note that the category associating unit 103 may associate image information items and text information items with each other using parameters other than categories. For example, in the case where text information items and image information items include time information items, a condition where the time information item included in an image information item and the time information item included in a text information item are within a certain time period may be used as a condition under which the image information item and the text information item are associated with each other. Moreover, in the case where text information items and image information items include user information items, a condition where the user information item included in an image information item and the user information item included in a text information item match may be used as a condition under which the image information item and the text information item are associated with each other.
Additionally, in the case where text information items and image information items include location information items such as global positioning system (GPS) information items, a condition where the location information item included in an image information item and the location information item included in a text information item match may be used as a condition under which the image information item and the text information item are associated with each other.
[0108] Furthermore, the text-information classification unit 101 and the image-information classification unit 102 may express the degrees of association with the categories numerically as scores. The category associating unit 103 may associate the microblog-text-information classification results 114 and the microblog-image-information classification results 115 with each other on the basis of the scores.
Other Exemplary Embodiments
[0109] Note that the present invention is not limited to the foregoing exemplary embodiments, and various modifications may be made without departing from the scope of the present invention. For example, the microblog is not limited to Twitter (registered trademark). Any type of medium, such as Facebook (registered trademark), may be used as long as comparatively short text items are posted on it, text information items and image information items (including still images, moving images, and link destination information items concerning links to information items concerning the still images or moving images) are mixed in it, and a large number of text information items and image information items are displayed in chronological order. Furthermore, for example, mail messages may be targets to be processed as posted information items.
[0110] In the foregoing exemplary embodiments, the functions of the individual units 100 to 104 included in the controller 10 are realized by a program.
However, all or some of the individual units may be realized by hardware such as an application-specific integrated circuit (ASIC). Furthermore, the program used in the foregoing exemplary embodiments may be stored on a recording medium, such as a compact disc read-only memory (CD-ROM), and supplied. Moreover, the steps described in the foregoing exemplary embodiments may be, for example, replaced, removed, or added without changing the scope of the present invention.
[0111] The foregoing description of the exemplary embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to understand the invention for various embodiments and with the various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.
[0112] In the claims which follow and in the preceding description of the invention, except where the context requires otherwise due to express language or necessary implication, the word "comprise" or variations such as "comprises" or "comprising" is used in an inclusive sense, i.e. to specify the presence of the stated features but not to preclude the presence or addition of further features in various embodiments of the invention.
Claims:
Claims (8)
1. An information classification program comprising: acquiring a plurality of posted information items, each of the plurality of posted information items including at least either of a text information item and an image information item; generating text information items including a plurality of text items in such a manner that image information items are removed from the plurality of posted information items, and individually classifying the text items included in the text information items into first categories; generating image information items including a plurality of images in such a manner that text information items are removed from the plurality of posted information items, and individually classifying the images included in the image information items into second categories; associating the classified text items and the classified images with each other on the basis of the first and second categories to obtain results; and outputting the text items and the images, which have been associated with each other, for each of the results.
2. The information classification program according to Claim 1, wherein, in the outputting, the text items and the images are output independently of each other for each of the results.
3. The information classification program according to Claim 1 or 2, wherein, in the associating, association information items in which the first categories and the second categories are associated with each other using overall categories are provided, and the classified text items and the classified images are associated with each other on the basis of the first categories, the second categories, and the overall categories included in the association information items.
4. An information classification method comprising: acquiring a plurality of posted information items, each of the plurality of posted information items including at least either of a text information item and an image information item; generating text information items including a plurality of text items in such a manner that image information items are removed from the plurality of posted information items, and individually classifying the text items included in the text information items into first categories; generating image information items including a plurality of images in such a manner that text information items are removed from the plurality of posted information items, and individually classifying the images included in the image information items into second categories; associating the classified text items and the classified images with each other on the basis of the first and second categories to obtain results; and outputting the text items and the images, which have been associated with each other, for each of the results.
5. An information processing apparatus comprising: an acquisition unit that acquires a plurality of posted information items, each of the plurality of posted information items including at least either of a text information item and an image information item; a text classification unit that generates text information items including a plurality of text items in such a manner that image information items are removed from the plurality of posted information items, and that individually classifies the text items included in the text information items into first categories; an image classification unit that generates image information items including a plurality of images in such a manner that text information items are removed from the plurality of posted information items, and that individually classifies the images included in the image information items into second categories; an associating unit that associates the classified text items and the classified images with each other on the basis of the first and second categories to obtain results; and an output unit that outputs the text items and the images, which have been associated with each other by the associating unit, for each of the results.
6. An information classification program substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.
7. An information classification method substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.
8. An information processing apparatus substantially as herein described with reference to any one of the embodiments of the invention illustrated in the accompanying drawings and/or examples.
Patent family:
Publication number | Publication date
US8930367B2 | 2015-01-06
US20140025682A1 | 2014-01-23
AU2013201018B2 | 2014-08-28
JP5895756B2 | 2016-03-30
JP2014021645A | 2014-02-03
Legal status:
2015-01-15 | FGA | Letters patent sealed or granted (standard patent)
2021-09-09 | HB | Alteration of name in register | Owner name: FUJIFILM BUSINESS INNOVATION CORP. Free format text: FORMER NAME(S): FUJI XEROX CO., LTD.
Priority:
Application number | Filing date | Patent title
JP2012158601A | JP5895756B2 | 2012-07-17 | 2012-07-17 | Information classification program and information processing apparatus
JP2012-158601 | | 2012-07-17 | |