Transportation Economic Development Impact System

Transportation Economic Development Impact System

Transportation Economic Development Impact System (TREDIS) is an economic analysis system sold by consulting firm Economic Development Research Group that is used in planning major transportation investments in the US and Canada. The role of economic impact analysis and TREDIS in the transportation planning process is explained in guidebooks of the US Department of Transportation and the American Association of State Highway and Transportation Officials. TREDIS has been most commonly used for assessing the expected economic impacts of statewide highway programs, regional multi-modal plans and public transport investment. Its history and theoretical foundation are explained in peer reviewed journal articles. == How It Works == TREDIS has a series of modules that calculate different forms of impacts and benefits. One module is an accounting framework that calculates user benefits, including impacts on cargo transportation and commuting costs, based on transportation forecasting results. A second module calculates wider economic development benefits, including impacts on business productivity, economic development and multiplier effects from the input-output analysis. It applies an economic model to estimate impacts on jobs, income, gross regional product and business output, by sector of the economy. A third module applies cost-benefit analysis from alternative perspectives.

Automatic summarization

Automatic summarization is the process of shortening a set of data computationally, to create a subset (a summary) that represents the most important or relevant information within the original content. Artificial intelligence (AI) algorithms are commonly developed and employed to achieve this, specialized for different types of data. Text summarization is usually implemented by natural language processing methods, designed to locate the most informative sentences in a given document. On the other hand, visual content can be summarized using computer vision algorithms. Image summarization is the subject of ongoing research; existing approaches typically attempt to display the most representative images from a given image collection, or generate a video that only includes the most important content from the entire collection. Video summarization algorithms identify and extract from the original video content the most important frames (key-frames), and/or the most important video segments (key-shots), normally in a temporally ordered fashion. Video summaries simply retain a carefully selected subset of the original video frames and, therefore, are not identical to the output of video synopsis algorithms, where new video frames are being synthesized based on the original video content. == Commercial products == In 2022 Google Docs released an automatic summarization feature. == Approaches == There are two general approaches to automatic summarization: extraction and abstraction. === Extraction-based summarization === Here, content is extracted from the original data, but the extracted content is not modified in any way. Examples of extracted content include key-phrases that can be used to "tag" or index a text document, or key sentences (including headings) that collectively comprise an abstract, and representative images or video segments, as stated above. For text, extraction is analogous to the process of skimming, where the summary (if available), headings and subheadings, figures, the first and last paragraphs of a section, and optionally the first and last sentences in a paragraph are read before one chooses to read the entire document in detail. Other examples of extraction that include key sequences of text in terms of clinical relevance (including patient/problem, intervention, and outcome). === Abstractive-based summarization === Abstractive summarization methods generate new text that did not exist in the original text. This has been applied mainly for text. Abstractive methods build an internal semantic representation of the original content (often called a language model), and then use this representation to create a summary that is closer to what a human might express. Abstraction may transform the extracted content by paraphrasing sections of the source document, to condense a text more strongly than extraction. Such transformation, however, is computationally much more challenging than extraction, involving both natural language processing and often a deep understanding of the domain of the original text in cases where the original document relates to a special field of knowledge. "Paraphrasing" is even more difficult to apply to images and videos, which is why most summarization systems are extractive. === Aided summarization === Approaches aimed at higher summarization quality rely on combined software and human effort. In Machine Aided Human Summarization, extractive techniques highlight candidate passages for inclusion (to which the human adds or removes text). In Human Aided Machine Summarization, a human post-processes software output, in the same way that one edits the output of automatic translation by Google Translate. == Applications and systems for summarization == There are broadly two types of extractive summarization tasks depending on what the summarization program focuses on. The first is generic summarization, which focuses on obtaining a generic summary or abstract of the collection (whether documents, or sets of images, or videos, news stories etc.). The second is query relevant summarization, sometimes called query-based summarization, which summarizes objects specific to a query. Summarization systems are able to create both query relevant text summaries and generic machine-generated summaries depending on what the user needs. An example of a summarization problem is document summarization, which attempts to automatically produce an abstract from a given document. Sometimes one might be interested in generating a summary from a single source document, while others can use multiple source documents (for example, a cluster of articles on the same topic). This problem is called multi-document summarization. A related application is summarizing news articles. Imagine a system, which automatically pulls together news articles on a given topic (from the web), and concisely represents the latest news as a summary. Image collection summarization is another application example of automatic summarization. It consists in selecting a representative set of images from a larger set of images. A summary in this context is useful to show the most representative images of results in an image collection exploration system. Video summarization is a related domain, where the system automatically creates a trailer of a long video. This also has applications in consumer or personal videos, where one might want to skip the boring or repetitive actions. Similarly, in surveillance videos, one would want to extract important and suspicious activity, while ignoring all the boring and redundant frames captured. At a very high level, summarization algorithms try to find subsets of objects (like set of sentences, or a set of images), which cover information of the entire set. This is also called the core-set. These algorithms model notions like diversity, coverage, information and representativeness of the summary. Query based summarization techniques, additionally model for relevance of the summary with the query. Some techniques and algorithms which naturally model summarization problems are TextRank and PageRank, Submodular set function, Determinantal point process, maximal marginal relevance (MMR) etc. === Keyphrase extraction === The task is the following. You are given a piece of text, such as a journal article, and you must produce a list of keywords or key[phrase]s that capture the primary topics discussed in the text. In the case of research articles, many authors provide manually assigned keywords, but most text lacks pre-existing keyphrases. For example, news articles rarely have keyphrases attached, but it would be useful to be able to automatically do so for a number of applications discussed below. Consider the example text from a news article: "The Army Corps of Engineers, rushing to meet President Bush's promise to protect New Orleans by the start of the 2006 hurricane season, installed defective flood-control pumps last year despite warnings from its own expert that the equipment would fail during a storm, according to documents obtained by The Associated Press". A keyphrase extractor might select "Army Corps of Engineers", "President Bush", "New Orleans", and "defective flood-control pumps" as keyphrases. These are pulled directly from the text. In contrast, an abstractive keyphrase system would somehow internalize the content and generate keyphrases that do not appear in the text, but more closely resemble what a human might produce, such as "political negligence" or "inadequate protection from floods". Abstraction requires a deep understanding of the text, which makes it difficult for a computer system. Keyphrases have many applications. They can enable document browsing by providing a short summary, improve information retrieval (if documents have keyphrases assigned, a user could search by keyphrase to produce more reliable hits than a full-text search), and be employed in generating index entries for a large text corpus. Depending on the different literature and the definition of key terms, words or phrases, keyword extraction is a highly related theme. ==== Supervised learning approaches ==== Beginning with the work of Turney, many researchers have approached keyphrase extraction as a supervised machine learning problem. Given a document, we construct an example for each unigram, bigram, and trigram found in the text (though other text units are also possible, as discussed below). We then compute various features describing each example (e.g., does the phrase begin with an upper-case letter?). We assume there are known keyphrases available for a set of training documents. Using the known keyphrases, we can assign positive or negative labels to the examples. Then we learn a classifier that can discriminate between positive and negative examples as a function of the features. Some classifiers make a binary classification for a test example, while others assign a probability of being a keyphrase. For ins

NAPLPS

NAPLPS (North American Presentation Layer Protocol Syntax) is a graphics language for use originally with videotex and teletext services. NAPLPS was developed from the Telidon system developed in Canada, with a small number of additions from AT&T Corporation. The basics of NAPLPS were later used as the basis for several other microcomputer-based graphics systems. == History == The Canadian Communications Research Centre (CRC), based in Ottawa, had been working on various graphics systems since the late 1960s, much of it led by Herb Bown. Through the 1970s they turned their attention to building out a system of "picture description instructions", which encoded graphics commands as a text stream. Graphics were encoded as a series of instructions (graphics primitives) each represented by a single ASCII character. Graphic coordinates were encoded in multiple 6-bit strings of XY coordinate data, flagged to place them in the printable ASCII range so that they could be transmitted with conventional text transmission techniques. ASCII SI/SO characters were used to differentiate the text from graphic portions of a transmitted "page". These instructions were decoded by separate programs to produce graphics output, on a plotter for instance. Other work produced a fully interactive version. In 1975, the CRC gave a contract to Norpak to develop an interactive graphics terminal that could decode the instructions and display them on a color display. During this period, a number of companies were developing the first teletext systems, notably the BBC's Ceefax system. Ceefax encoded character data into the lines in the vertical blanking interval of normal television signals where they could not be seen on-screen, and then used a buffer and decoder in the user's television to convert these into "pages" of text on the display. The Independent Broadcasting Authority quickly introduced their own ORACLE system, and the two organizations subsequently agreed to use a single standard, the "Broadcast Teletext Specification". This later became World System Teletext. At about the same time, other organizations were developing videotex systems, similar to teletext except they used modems to transmit their data instead of television signals. This was potentially slower and used up a telephone line, but had the major advantage of allowing the user to transmit data back to the sender. The UK's General Post Office developed a system using the Ceefax/ORACLE standard, launching it as Prestel, while France prepared the first steps for its ultimately very successful Minitel system, using a rival display standard called Antiope. By 1977, the Norpak system was running, and from this work the CRC decided to create their own teletext/videotext system. Unlike the systems being rolled out in Europe, the CRC decided from the start that the system should be able to run on any combination of communications links. For instance, it could use the vertical blanking interval to send data to the user, and a modem to return selections to the servers. It could be used in a one-way or two-way system. In teletext mode, character codes were sent to users' televisions by encoding them as dot patterns in the vertical blanking interval of the video signal. Various technical "tweaks" and details of the NTSC signals used by North American televisions allowed the downstream videotex channel to increase to 600 bit/s, about twice that used in the European systems. In videotext mode, Bell 202 modems were typical, offering a 1,200 bit/s download rate. A set top box attached to the TV decoded these signals back into text and graphics pages, which the user could select among. The system was publicly launched as Telidon on August 15, 1978. Compared to the European standards, the CRC system was faster, bi-directional, and offered real graphics as opposed to simple character graphics. The downside of the system was that it required much more advanced decoders, typically featuring Zilog Z80 or Motorola 6809 processors with RGB and/or RF output. The Innovation, Science and Economic Development Canada (then Department of Communications) launched a four-year plan to fund public roll-outs of the technology in an effort to spur the development of a commercial Telidon system. AT&T Corporation was so impressed by Telidon that they decided to join the project. They added a number of useful extensions, notably the ability to define original graphics commands (macro) and character sets (DRCS). They also tabled algorithms for proportionally spaced text, which greatly improved the quality of the displayed pages. A joint CSA/ANSI working group (X3L2.1) revised the specifications, which were submitted for standardization. In 1983, they became CSA T500 and ANSI X3.110, or NAPLPS. The data encoding system was also standardized as the NABTS (North American Broadcast Teletext Specification) protocol. Business models for Telidon services were poorly developed. Unlike the UK, where teletext was supported by one of only two large companies whose whole revenue model was based on a read-only medium (television), in North America Telidon was being offered by companies who worked on a subscriber basis. == One-way systems == Telidon-based teletext was tested in a few North American trials in the early 1980s — CBC IRIS, TVOntario, MTS-sponsored Project IDA, to name a few. NAPLPS was also part of the NABTS teletext standard, for the encoding and display of teletext pages. In the late 1980s and early 1990s, affiliates of the regional sports network group SportsChannel ran a service called Sports Plus Network, which ran sports news and scores while SportsChannel was not otherwise on the air. The screens, which frequently featured team logos or likenesses of players in addition to text, were drawn entirely with NAPLPS graphics and resembled the loading of Prodigy pages over a modem, though slightly faster. == Two-way systems == Various two-way systems using NAPLPS appeared in North America in the early 1980s. The biggest North American examples were Knight Ridder's Viewtron (based in Miami) and the Los Angeles Times' Gateway service (based in Orange County). Both used the Sceptre NAPLPS terminal from AT&T. The Sceptre contained a slow modem that connected over the consumer's telephone line to host computers. The Sceptre was expensive whether purchased or rented. Despite huge investments by their parent companies, neither Viewtron nor Gateway lasted into the second half of the decade. Another system, Keyfax, was developed by Keycom Electronic Publishing, a joint venture of Honeywell, Centel (since acquired by Sprint) and Field Enterprises, then-owner of the Chicago Sun-Times newspaper. Keyfax had originally been a WST teletext service, broadcast overnights on Field's Chicago television station WFLD-32 and through the VBI of both WFLD and national superstation WTBS; the decision was made to convert Keyfax into a subscription service, using a proprietary NAPLPS terminal device in a last-ditch effort to save the service. It did not work and Keyfax had ceased operations by the end of 1986. Other early-1980s NAPLPS technology was deployed in Canada, both as a way for rural Canadians to get news and weather information and as the platform for touchscreen information kiosks. In Vancouver these were featured at Expo 86. The kiosks became ubiquitous in Toronto under the name Teleguide, and were deployed in many shopping centres and at major tourist attractions. The latter city was the North American nexus of NAPLPS and the home of Norpak, the most successful of NAPLPS-oriented developers. Norpak created and sold hardware and software for NAPLPS development and display. TVOntario also developed NAPLPS content creation software. London, Ontario - based Cableshare used NAPLPS as the basis of touch-screen information kiosks for shopping malls, the flagship of which was deployed at Toronto's Eaton Centre. The system relied on an 8085-based microcomputer which drove several NAPLPS terminals fitted with touch screens, all communicating via Datapac to a back end database. The system offered news, weather and sports information along with shopping mall guides and coupons. Cableshare also developed and sold a leading NAPLPS page creation utility called the "Picture Painter." In the late 1980s, Tribune Media Services (TMS) and the Associated Press operated a cable television channel called AP News Plus that provided NAPLPS-based news screens to cable television subscribers in many U.S. cities. The news pages were created and edited by TMS staffers working on an Atex editing system in Orlando, Florida, and sent by satellite to NAPLPS decoder devices located at the local cable television companies. Among the firms providing technology to TMS and the Associated Press for the AP News Plus channel was Minneapolis-based Electronic Publishers Inc. (1985–1988). In 1981, two amateur radio operators (VE3FTT and VE3GQW) received special permission from the Canad

Intrapixel and Interpixel processing

Intrapixel and Interpixel processing is used in the processing of computers graphics, as well as sensors and images in equipment such as cameras. For computer graphics, CMOS sensor processing is done in pixel level. This process includes two general categories: intrapixel processing, where the processing is performed on the individual pixel signals, and interpixel processing, where the processing is performed locally or globally on signals from several pixels. The purpose of interpixel processing is to perform early vision processing, not merely to capture images. Intrapixel and Interpixel processing is an integral part of spatial processing within the earth Mixed Spatial Attraction Model. This also includes use within hyperspectral image processing.

Distinguishable interfaces

Distinguishable interfaces use computer graphic principles to automatically generate easily distinguishable appearance for computer data. Although the desktop metaphor revolutionized user interfaces, there is evidence that a spatial layout alone does little to help in locating files and other data; distinguishable appearance is also required. Studies have shown that average users have considerable difficulty finding files on their personal computers, even ones that they created the same day. Search engines do not always help, since it has been found that users often know of the existence of a file without being able to specify relevant search terms. On the contrary, people appear to incrementally search for files using some form of context. Recently researchers and web developers have argued that the problem is the lack of distinguishable appearance: in the traditional computer interface most objects and locations appear identical. This problem rarely occurs in the real world, where both objects and locations generally have easily distinguishable appearance. Discriminability was one of the recommendations in the ISO 9241-12 recommendation on presentation of information on visual displays (part of the overall report on Ergonomics of Human System Interaction), however it was assumed in that report that this would be achieved by manual design of graphical symbols. == VisualIDs, semanticons, and identicons == The mass availability of computer graphics supported the introduction of approaches that make better use of the brain's "visual hardware", by providing individual files and other abstract data with distinguishable appearance. This idea initially appeared in strictly academic VisualIDs and Semanticons works, but the web community has explored and rapidly adopted similar ideas, such as the Identicon. The VisualIDs project automatically generated icons for files or other data based on a hash of the data identifier, so the icons had no relation to the content or meaning of the data. It was argued not only that generating meaningful icons is unnecessary (their user study showed rapid learning of the arbitrary icons), but also that basing icons on content is actually incorrect ("contrasting visualization with visual identifiers"). The Semanticons project developed by Setlur et al. demonstrated an algorithm to create icons that reflect the content of files. In this work the name, location and content of a file are parsed and used to retrieve related image(s) from an image database. These are then processed using a Non-photorealistic rendering technique in order to generate graphical icons. Developer Don Park introduced the identicon library for making a visual icon from a hash of a data identifier. This initial public implementation has spawned a large number of implementations for various environments. In particular, identicons are now being used as default visual user identifiers (avatars) for several widely used systems. They are also used as a complement to Gravatars, which are pre-existing avatar images created or chosen by users, instead of automatically generated images. (see #External links). == Current research == While current web practice has followed the semantics-free approach of VisualIDs, recent research has followed the semantics-based approach of Semanticons. Examples include using data mining principles to automatically create "intelligent icons" that reflect the contents of files and creating icons for music files that reflect audio characteristics or affective content.

IBM ALP

IBM Assembly Language Processor (ALP) is an assembler written by IBM for 32-bit OS/2 Warp (OS/2 3.0), which was released in 1994. ALP accepts source programs compatible with Microsoft Macro Assembler (MASM) version 5.1, which was originally used to build many of the device drivers included with OS/2. For OS/2 versions 3 and 4, ALP was distributed, along with other tools and documentation, as part of the Device Driver Kit (DDK). The DDK was withdrawn in 2004 as part of IBM's discontinuance of OS/2.

ImageNet

The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured and in at least one million of the images, bounding boxes are also provided. ImageNet contains more than 20,000 categories, with a typical category, such as "balloon" or "strawberry", consisting of several hundred images. The database of annotations of third-party image URLs is freely available directly from ImageNet, though the actual images are not owned by ImageNet. Since 2010, the ImageNet project runs an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where software programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of one thousand non-overlapping classes. == History == AI researcher Fei-Fei Li began working on the idea for ImageNet in 2006. At a time when most AI research focused on models and algorithms, Li wanted to expand and improve the data available to train AI algorithms. In 2007, Li met with Princeton professor Christiane Fellbaum, one of the creators of WordNet, to discuss the project. As a result of this meeting, Li went on to build ImageNet starting from the roughly 22,000 nouns of WordNet and using many of its features. She was also inspired by a 1987 estimate that the average person recognizes roughly 30,000 different kinds of objects. As an assistant professor at Princeton, Li assembled a team of researchers to work on the ImageNet project. They used Amazon Mechanical Turk to help with the classification of images. Labeling started in July 2008 and ended in April 2010. It took 49K workers from 167 countries filtering and labeling over 160M candidate images. They had enough budget to have each of the 14 million images labelled three times. The original plan called for 10,000 images per category, for 40,000 categories at 400 million images, each verified 3 times. They found that humans can classify at most 2 images/sec. At this rate, it was estimated to take 19 human-years of labor (without rest). They presented their database for the first time as a poster at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) in Florida, titled "ImageNet: A Preview of a Large-scale Hierarchical Dataset". The poster was reused at Vision Sciences Society 2009. In 2009, Alex Berg suggested adding object localization as a task. Li approached PASCAL Visual Object Classes contest in 2009 for a collaboration. It resulted in the subsequent ImageNet Large Scale Visual Recognition Challenge starting in 2010, which has 1000 classes and object localization, as compared to PASCAL VOC which had just 20 classes and 19,737 images (in 2010). === Significance for deep learning === On 30 September 2012, a convolutional neural network (CNN) called AlexNet achieved a top-5 error of 15.3% in the ImageNet 2012 Challenge, more than 10.8 percentage points lower than that of the runner-up. Using convolutional neural networks was feasible due to the use of graphics processing units (GPUs) during training, an essential ingredient of the deep learning revolution. According to The Economist, "Suddenly people started to pay attention, not just within the AI community but across the technology industry as a whole." In 2015, AlexNet was outperformed by Microsoft's very deep CNN with over 100 layers, which won the ImageNet 2015 contest, having 3.57% error on the test set. Andrej Karpathy estimated in 2014 that with concentrated effort, he could reach 5.1% error rate, and ~10 people from his lab reached ~12-13% with less effort. It was estimated that with maximal effort, a human could reach 2.4%. == Dataset == ImageNet crowdsources its annotation process. Image-level annotations indicate the presence or absence of an object class in an image, such as "there are tigers in this image" or "there are no tigers in this image". Object-level annotations provide a bounding box around the (visible part of the) indicated object. ImageNet uses a variant of the broad WordNet schema to categorize objects, augmented with 120 categories of dog breeds to showcase fine-grained classification. In 2012, ImageNet was the world's largest academic user of Mechanical Turk. The average worker identified 50 images per minute. The original plan of the full ImageNet would have roughly 50M clean, diverse and full resolution images spread over approximately 50K synsets. This was not achieved. The summary statistics given on April 30, 2010: Total number of non-empty synsets: 21841 Total number of images: 14,197,122 Number of images with bounding box annotations: 1,034,908 Number of synsets with SIFT features: 1000 Number of images with SIFT features: 1.2 million === Categories === The categories of ImageNet were filtered from the WordNet concepts. Each concept, since it can contain multiple synonyms (for example, "kitty" and "young cat"), so each concept is called a "synonym set" or "synset". There were more than 100,000 synsets in WordNet 3.0, majority of them are nouns (80,000+). The ImageNet dataset filtered these to 21,841 synsets that are countable nouns that can be visually illustrated. Each synset in WordNet 3.0 has a "WordNet ID" (wnid), which is a concatenation of part of speech and an "offset" (a unique identifying number). Every wnid starts with "n" because ImageNet only includes nouns. For example, the wnid of synset "dog, domestic dog, Canis familiaris" is "n02084071". The categories in ImageNet fall into 9 levels, from level 1 (such as "mammal") to level 9 (such as "German shepherd"). === Image format === The images were scraped from online image search (Google, Picsearch, MSN, Yahoo, Flickr, etc) using synonyms in multiple languages. For example: German shepherd, German police dog, German shepherd dog, Alsatian, ovejero alemán, pastore tedesco, 德国牧羊犬. ImageNet consists of images in RGB format with varying resolutions. For example, in ImageNet 2012, "fish" category, the resolution ranges from 4288 x 2848 to 75 x 56. In machine learning, these are typically preprocessed into a standard constant resolution, and whitened, before further processing by neural networks. For example, in PyTorch, ImageNet images are by default normalized by dividing the pixel values so that they fall between 0 and 1, then subtracting by [0.485, 0.456, 0.406], then dividing by [0.229, 0.224, 0.225]. These are the mean and standard deviations for ImageNet, so this whitens the input data. === Labels and annotations === Each image is labelled with exactly one wnid. Dense SIFT features (raw SIFT descriptors, quantized codewords, and coordinates of each descriptor/codeword) for ImageNet-1K were available for download, designed for bag of visual words. The bounding boxes of objects were available for about 3000 popular synsets with on average 150 images in each synset. Furthermore, some images have attributes. They released 25 attributes for ~400 popular synsets: Color: black, blue, brown, gray, green, orange, pink, red, violet, white, yellow Pattern: spotted, striped Shape: long, round, rectangular, square Texture: furry, smooth, rough, shiny, metallic, vegetation, wooden, wet === ImageNet-21K === The full original dataset is referred to as ImageNet-21K. ImageNet-21k contains 14,197,122 images divided into 21,841 classes. Some papers round this up and name it ImageNet-22k. The full ImageNet-21k was released in Fall of 2011, as fall11_whole.tar. There is no official train-validation-test split for ImageNet-21k. Some classes contain only 1-10 samples, while others contain thousands. === ImageNet-1K === There are various subsets of the ImageNet dataset used in various context, sometimes referred to as "versions". One of the most highly used subsets of ImageNet is the "ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012–2017 image classification and localization dataset". This is also referred to in the research literature as ImageNet-1K or ILSVRC2017, reflecting the original ILSVRC challenge that involved 1,000 classes. ImageNet-1K contains 1,281,167 training images, 50,000 validation images and 100,000 test images. Each category in ImageNet-1K is a leaf category, meaning that there are no child nodes below it, unlike ImageNet-21K. For example, in ImageNet-21K, there are some images categorized as simply "mammal", whereas in ImageNet-1K, there are only images categorized as things like "German shepherd", since there are no child-words below "German shepherd". === Later developments === In the WordNet they built ImageNet on, there were 2832 synsets in the "person" subtree. During 2018--2020 period, they removed the download of the ImageNet-21k as they went through extensive filtering in these person synsets. Out of these 2832 synsets, 1593 were deemed "potentially offensive". Out of the remaining 1239, 1081 were deemed not really "visual". The result was that only 158 syn