Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Smartphones have advanced from simple communication devices to multipurpose devices that capture almost every single moment in our daily lives and thus contain sensitive data like photos or contact information. In order to protect this data, users can choose from a variety of authentication schemes. However, what happens if one of these schemes fails, for example, when users are not able to provide the correct password within a limited number of attempts? So far, situations like this have been neglected by the usable security and privacy community, which mainly focuses on primary authentication schemes. Yet fallback authentication is just as important, because it enables users to regain access to their devices (and data) in case of lockouts. In theory, any scheme for primary authentication on smartphones could also be used as a fallback solution. In practice, fallback authentication happens less frequently and imposes different requirements and challenges on its design. The aim of this work is to understand and address these challenges. We investigate the occurrences of fallback authentication on smartphones in real life in order to grasp the characteristics that fallback authentication conveys. We also gain deeper insights into the difficulties that users have to cope with during lockout situations. In combination with the knowledge from previous research, these insights are valuable for providing a detailed definition of fallback authentication that has been missing so far. The definition covers usability and security characteristics and highlights the differences from primary authentication. Furthermore, we explore the potential of autobiographical memory, a part of human memory that relates to personal experiences of the past, for the design of alternative fallback schemes to overcome the well-known memorability issues of current solutions. We present the design and evaluation of two static approaches that are based on the memory of locations and special drawings. We also cover three dynamic approaches that relate to recent smartphone activities, icon arrangements and installed apps. This series of work allows us to analyze the suitability of different types of memories for fallback authentication. It also helps us to extend the definition of fallback authentication by identifying factors that influence the quality of fallback schemes. The main contributions of this thesis can be summarized as follows: First, it gives essential insights into the relevance, frequency and problems of fallback authentication on smartphones in real life. Second, it provides a clear definition of fallback authentication to classify authentication schemes based on usability and security properties. Third, it shows example implementations and evaluations of static and dynamic fallback schemes that are based on different autobiographical memories. Finally, it discusses the advantages and disadvantages of these memories and gives recommendations for their design, evaluation and analysis in the context of fallback authentication.
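To make the idea of a dynamic fallback scheme more concrete, the following sketch shows how a challenge could be generated from the set of installed apps. It is a minimal illustration only, with invented app names and a hypothetical `build_app_challenge` helper; it is not the scheme evaluated in the thesis.

```python
import random

def build_app_challenge(installed_apps, decoy_apps, n_targets=3, n_decoys=3, seed=None):
    """Build one fallback challenge: which of these apps are installed on your phone?"""
    rng = random.Random(seed)
    targets = rng.sample(installed_apps, n_targets)   # apps the user actually has
    decoys = rng.sample(decoy_apps, n_decoys)         # plausible apps the user does not have
    options = targets + decoys
    rng.shuffle(options)
    return options, set(targets)

def verify_answer(selected, targets, max_errors=0):
    """Accept the answer if the selected set matches the installed apps closely enough."""
    errors = len(set(selected) ^ targets)  # symmetric difference counts misses and false picks
    return errors <= max_errors

# Example usage with made-up app lists
installed = ["Maps", "Podcasts", "BankApp", "Chess", "RunTracker"]
decoys = ["PhotoFilterPro", "CryptoTicker", "RecipeBox", "StarGazer"]
options, targets = build_app_challenge(installed, decoys, seed=42)
print("Which of these apps are installed on your phone?", options)
print("Accepted:", verify_answer(targets, targets))
```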
Metabolic processes, signal transduction, gene regulation, as well as gene and protein expression are largely controlled by biological networks. High-throughput experiments allow the measurement of a wide range of cellular states and interactions. However, networks are often not known in detail for specific biological systems and conditions. Gene and protein annotations are often transferred from model organisms to the species of interest. Therefore, the question arises whether biological networks can be transferred between species or whether they are specific for individual contexts. In this thesis, the following aspects are investigated: (i) the conservation and (ii) the cross-species transfer of eukaryotic protein-interaction and gene regulatory (transcription factor-target) networks, as well as (iii) the conservation of alternatively spliced variants. In the simplest case, interactions can be transferred between species based solely on the sequence similarity of the orthologous genes. However, such a transfer often results either in the transfer of only a few interactions (medium/high sequence similarity threshold) or in the transfer of many speculative interactions (low sequence similarity threshold). Thus, advanced network transfer approaches also consider the annotations of orthologous genes involved in the interaction transfer, as well as features derived from the network structure, in order to enable a reliable interaction transfer, even between phylogenetically very distant species. In this work, such an approach for the transfer of protein interactions is presented (COIN). COIN uses a sophisticated machine-learning model in order to label transferred interactions as either correctly transferred (conserved) or incorrectly transferred (not conserved). The comparison and the cross-species transfer of regulatory networks is more difficult than the transfer of protein interaction networks, as a huge fraction of the known regulations is described only in the (not machine-readable) scientific literature. In addition, compared to protein interactions, only a few conserved regulations are known, and regulatory elements appear to be strongly context-specific. In this work, the cross-species analysis of regulatory interaction networks is enabled with software tools and databases for global (ConReg) and thousands of context-specific (CroCo) regulatory interactions that are derived and integrated from the scientific literature, binding site predictions and experimental data. Genes and their protein products are the main players in biological networks. However, the fact that a single gene can encode several different proteins has largely been neglected to date. These alternative proteins can differ strongly from each other with respect to their molecular structure, function and their role in networks. The identification of conserved and species-specific splice variants and the integration of variants in network models will allow a more complete cross-species transfer and comparison of biological networks. With ISAR we support the cross-species transfer and comparison of alternative variants by introducing a gene-structure aware (i.e. exon-intron structure aware) multiple sequence alignment approach for variants from orthologous and paralogous genes. The methods presented here and the accompanying databases allow the cross-species transfer of biological networks, the comparison of thousands of context-specific networks, and the cross-species comparison of alternatively spliced variants.
Thus, they can be used as a starting point for the understanding of regulatory and signaling mechanisms in many biological systems.
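The following sketch illustrates only the naive, purely similarity-threshold-based transfer that the thesis improves upon with COIN; the gene identifiers, ortholog mapping and threshold are invented for illustration.

```python
def transfer_interactions(source_interactions, orthologs, similarity, min_similarity=0.4):
    """
    Naive cross-species interaction transfer: an interaction (a, b) in the source
    species is transferred if both partners have orthologs in the target species
    and both ortholog pairs exceed a sequence-similarity threshold.
    """
    transferred = set()
    for a, b in source_interactions:
        if a in orthologs and b in orthologs:
            a_t, b_t = orthologs[a], orthologs[b]
            if similarity[(a, a_t)] >= min_similarity and similarity[(b, b_t)] >= min_similarity:
                transferred.add((a_t, b_t))
    return transferred

# Toy example with invented identifiers
source_interactions = {("P1", "P2"), ("P2", "P3")}
orthologs = {"P1": "Q1", "P2": "Q2", "P3": "Q3"}
similarity = {("P1", "Q1"): 0.8, ("P2", "Q2"): 0.7, ("P3", "Q3"): 0.3}
print(transfer_interactions(source_interactions, orthologs, similarity))  # {('Q1', 'Q2')}
```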
The ever-increasing technical capabilities of mobile devices make it possible to use them for mobile work or for controlling industrial manufacturing processes. For reasons of information security and operational safety, as well as to implement functional requirements, it is often necessary to restrict the corresponding access rights to users located within authorized zones. For example, reading out critical data can be restricted to individual offices, or the mobile control of machines can be limited to suitable locations within a factory hall. This requires determining the user's position. In real deployments, however, position estimates can be subject to errors on the order of the size of the authorized zones. To date, no solutions exist that take these errors into account in authorization decisions in order to minimize the resulting damage from incorrect decisions. Furthermore, there are currently no methods for analyzing the quality properties of such location constraints prior to deployment and for deciding whether a given positioning system is suitable given the magnitude of its position errors. This thesis therefore presents solutions for capturing and handling such position errors in the context of location-based authorization. To this end, an estimation method for position errors in fingerprint-based positioning systems is first introduced, which derives a probability distribution of the user's location from the characteristics of the performed measurements. To efficiently determine from this distribution the probability that the user is inside an authorized zone, an algorithm is presented that, based on precomputations, achieves a considerable runtime improvement over direct computation. For the first time, a comprehensive decision-theoretic comparison of existing location-based authorization strategies is presented. With the risk-based authorization strategy, a new methodology is introduced that is optimal from a decision-theoretic point of view. Approaches are presented for extending classical access control models with location constraints that, when enforced, take into account the possibility of position errors and the consequences of incorrect decisions. For the specification of authorized zones, property models are introduced that, in contrast to conventional polygons, model for each location the probability of observing a required property there. Methods are also presented to reduce the influence of measurement outliers on authorization decisions. Furthermore, analysis techniques are introduced that allow a qualitative and quantitative assessment of the suitability of positioning systems for a given scenario. The quantitative assessment is based on the developed concept of authorization models, which specify for each location the probability of obtaining a position estimate there that leads to authorization. The qualitative assessment provides, for the first time, a binary criterion for making a concrete statement about the suitability of a positioning system for a given scenario. The applicability of this analysis method is illustrated in a case study, which demonstrates the need for such an analysis before location-based authorization is deployed. It is shown that, for typical positioning systems, the developed risk-based methods enable a considerable reduction of the damage caused by incorrect decisions and thus improve the applicability of location-based authorization.
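A minimal sketch of the decision-theoretic idea behind risk-based authorization: access is granted only if the expected damage of wrongly granting it is no larger than the expected damage of wrongly denying it. The cost values and the simple two-outcome model are assumptions for illustration, not the thesis's actual strategy.

```python
def risk_based_decision(p_inside, cost_false_grant, cost_false_deny):
    """
    Decision-theoretic authorization: grant access iff the expected cost of granting
    (wrongly granting to a user outside the zone) is lower than or equal to the
    expected cost of denying (wrongly denying a user who is inside the zone).
    """
    expected_cost_grant = (1.0 - p_inside) * cost_false_grant
    expected_cost_deny = p_inside * cost_false_deny
    return expected_cost_grant <= expected_cost_deny

# Example: position estimate gives 70% probability of being inside the authorized zone
print(risk_based_decision(p_inside=0.7, cost_false_grant=100.0, cost_false_deny=10.0))   # False (deny)
print(risk_based_decision(p_inside=0.97, cost_false_grant=100.0, cost_false_deny=10.0))  # True (grant)
```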
Real world data from various application domains can be modeled as a graph, e.g. social networks and biomedical networks like protein interaction networks or co-activation networks of brain regions. A graph is a powerful concept to model arbitrary (structural) relationships among objects. In recent years, the prevalence of social networks has made graph mining an important center of attention in the data mining field. There are many important tasks in graph mining, such as graph clustering, outlier detection, and link prediction. Many algorithms have been proposed in the literature to solve these tasks. However, these problems are normally solved separately, although they are closely related. Detecting and exploiting the relationship among them is a new challenge in graph mining. Moreover, with the data explosion, more information has been integrated into the graph structure. For example, bipartite graphs contain two types of nodes, and graphs with node attributes offer additional non-structural information. Therefore, more challenges arise from the increasing graph complexity. This thesis aims to address these challenges in order to gain new knowledge from graph data. An important paradigm of data mining used in this thesis is the principle of Minimum Description Length (MDL). It follows the assumption that the more knowledge we have learned from the data, the better we are able to compress the data. The MDL principle balances the complexity of the selected model and the goodness of fit between model and data; thus, it naturally avoids over-fitting. This thesis proposes several algorithms based on the MDL principle to acquire knowledge from various types of graphs: Info-spot (Automatically Spotting Information-rich Nodes in Graphs) is a parameter-free and efficient algorithm for the fully automatic detection of interesting nodes, a novel outlier notion in graphs. In contrast to traditional graph mining approaches that focus on discovering dense subgraphs, the novel graph mining technique CXprime (Compression-based eXploiting Primitives) models the transitivity and the hubness of a graph using structure primitives (all possible three-node substructures). Under the coding scheme of CXprime, clusters with structural information can be discovered, dominating substructures of a graph can be distinguished, and a new link prediction score based on substructures is proposed. The next algorithm, SCMiner (Summarization-Compression Miner), integrates tasks such as graph summarization, graph clustering, link prediction, and the discovery of the hidden structure of a bipartite graph on the basis of data compression. Finally, a method for non-redundant graph clustering called IROC (Information-theoretic non-Redundant Overlapping Clustering) is proposed to smartly combine structural information with non-structural information based on MDL. IROC is able to detect overlapping communities within subspaces of the attributes. To sum up, algorithms to unify different learning tasks for various types of graphs are proposed. Additionally, these algorithms are based on the MDL principle, which facilitates the unification of different graph learning tasks, the integration of different graph types, and the automatic selection of input parameters that are otherwise difficult to estimate.
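The MDL principle invoked above can be illustrated with a toy example that has nothing to do with graphs: a two-part code for a 0/1 sequence under a Bernoulli model. The structured sequence compresses below the one-bit-per-symbol baseline, the random-looking one does not; the coding schemes for graphs in the thesis are, of course, far more elaborate.

```python
import math

def bernoulli_code_length(bits):
    """Two-part MDL code length (in bits) for a 0/1 sequence: first encode the
    number of ones (the model), then the data given that model."""
    n, k = len(bits), sum(bits)
    model_bits = math.log2(n + 1)            # which of the n+1 possible counts of ones
    data_bits = math.log2(math.comb(n, k))   # which arrangement with exactly k ones
    return model_bits + data_bits

def uniform_code_length(bits):
    """Baseline: one bit per symbol, no structure learned."""
    return float(len(bits))

structured = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
random_ish = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

for name, bits in [("structured", structured), ("random-ish", random_ish)]:
    print(name, round(bernoulli_code_length(bits), 2), "vs baseline", uniform_code_length(bits))
```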
Participatory sensor networks (PSNs) are a new kind of sensor network built from mobile phones that are voluntarily made available by their owners. They enable the large-scale collection of measurement data in people's immediate surroundings and can be used for numerous application scenarios. Alongside their benefits, however, PSNs also pose difficulties. Two central challenges are resource-efficient data collection and the protection of privacy; both result from using private mobile phones as data collection instruments. Since recording measurement data is not the primary purpose of these devices, it must not noticeably strain their resources. In addition, it must be ensured that collecting measurement data does not violate the privacy of the participating users. The first part of the thesis addresses resource-efficient data collection. It first considers PSNs in which measurements are taken at selected points. In such networks, the participating devices must be informed about the measurements to be performed. To avoid unnecessarily straining the devices' resources, a concept is presented that ensures a robust distribution of the measurement tasks while at the same time trying to conserve the phones' energy resources. In PSNs with large-scale and continuous data collection, the distribution of measurement tasks plays a less decisive role. Instead, it must be ensured that the energy and transmission costs on the users' side remain as low as possible. For this reason, an approach for the local grouping of sensing nodes is described that, by dividing the arising tasks and intelligently selecting nodes, leads to resource-efficient and less redundant data collection. The second part of the thesis deals with protecting the participants' privacy and comprises two topics. First, an approach for the automated generation of privacy zones is presented that adapts the zones to the respective environment without user intervention. These zones are created around the locations frequently visited by the user and thus prevent potential de-anonymization attacks based on the identification of these locations. Second, a calibration system for PSNs is described whose focus lies both on improving data quality and on preserving the users' privacy. To this end, it allows the retroactive adjustment of data that has already been uploaded while, at the same time, preventing a direct attribution to a user by modifying the calibration parameters and the upload times.
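A minimal sketch of the privacy-zone idea, assuming pre-discretised locations and fixed parameters; the approach in the thesis adapts the zones to the environment automatically, which is not shown here.

```python
from collections import Counter

def privacy_zones(visits, min_visits=5, radius_m=250.0):
    """
    Toy privacy-zone generation: any location visited at least `min_visits` times
    becomes the centre of a circular zone of `radius_m` metres; measurements inside
    a zone would be suppressed or coarsened before upload.
    Locations are assumed to be pre-discretised (e.g. rounded coordinates).
    """
    counts = Counter(visits)
    return [{"centre": loc, "radius_m": radius_m}
            for loc, n in counts.items() if n >= min_visits]

# Example with rounded (lat, lon) pairs standing in for frequently visited places
visits = [(48.15, 11.58)] * 9 + [(48.14, 11.56)] * 2 + [(48.18, 11.60)] * 6
print(privacy_zones(visits))
```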
Compartment models are a frequently used tool for analyzing data gained with medical and biological imaging techniques. The solutions of the differential equations derived from a compartment model provide nonlinear parametric functions, based on which the behavior of a concentration of interest over time can be described. Often, the number of compartments in a compartment model is unknown. As the model complexity itself, that is, the number of compartments, is an important piece of information, it is desirable to estimate it from the observed data. Additionally, the unknown parameters have to be estimated. Therefore, methods dealing with both parameter estimation and model selection in compartment models are needed. The methods proposed in this thesis are motivated by two applications from the field of medical and biological imaging. In the first application, the quantitative analysis of fluorescence recovery after photobleaching (FRAP) data, compartment models are used in order to gain insight into the binding behavior of molecules in living cells. As a first approach, we developed a Bayesian nonlinear mixed-effects model for the analysis of a series of FRAP images. Mixed-effects priors are defined on the parameters of the nonlinear model, which is a novel approach. With the proposed model, we obtain parameter estimates and additionally gain information about the variability between nuclei, which has not been studied so far. The proposed method was evaluated on half-nucleus FRAP data, also in comparison with different kinds of fixed-effects models. As a second approach, a pixelwise analysis of FRAP data is proposed, where information from the neighboring pixels is included in the nonlinear model for each pixel. This is innovative, as the existing models are suitable for the analysis of FRAP data for some regions of interest only. For the second application, the quantitative analysis of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) of the breast, we use a compartment model which describes the exchange of blood between different, well-mixed compartments. In the analysis of such data, the number of compartments allows conclusions about the heterogeneity of cancerous tissue. Therefore, an estimation and model selection approach based on boosting is proposed, with which the number of compartments and the unknown parameters can be estimated at the voxel level. In contrast to boosting for additive regression, where smoothing approaches are used, boosting in nonlinear parametric regression as described in this thesis is a novel approach. In an extension of this approach, the spatial structure of an image is taken into account by penalizing the differences in the parameter estimates of neighboring voxels. The method was evaluated in simulation studies as well as in the application to data from a breast cancer study. The majority of the program code used in the three approaches was newly developed in the programming languages R and C. Based on that code, two R packages were built.
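As a minimal illustration of what parameter estimation in a compartment model amounts to, the following sketch fits a single-compartment recovery curve to simulated data; the Bayesian mixed-effects models and the boosting approach of the thesis are not shown.

```python
import numpy as np
from scipy.optimize import curve_fit

def recovery(t, a, k):
    """Single-compartment recovery curve: fraction recovered at time t."""
    return a * (1.0 - np.exp(-k * t))

# Simulate noisy recovery data for an assumed 'true' amplitude and rate
rng = np.random.default_rng(0)
t = np.linspace(0, 60, 61)
y = recovery(t, a=0.8, k=0.12) + rng.normal(0, 0.02, t.size)

# Estimate the unknown parameters from the observed curve
(a_hat, k_hat), _ = curve_fit(recovery, t, y, p0=(1.0, 0.1))
print(f"estimated amplitude {a_hat:.3f}, rate {k_hat:.3f}")
```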
We present methods for improved handling of morphologically rich languages (MRLS), where we define MRLS as languages that are morphologically more complex than English. Standard algorithms for language modeling, tagging and parsing have problems with the productive nature of such languages. Consider for example the possible forms of a typical English verb like work, which generally has four different forms: work, works, working and worked. Its Spanish counterpart trabajar has six different forms in present tense: trabajo, trabajas, trabaja, trabajamos, trabajáis and trabajan, and more than 50 different forms when including the different tenses, moods (indicative, subjunctive and imperative) and participles. Such a high number of forms leads to sparsity issues: in a recent Wikipedia dump of more than 400 million tokens we find that 20 of these forms occur only twice or less and that 10 forms do not occur at all. This means that even if we only need unlabeled data to estimate a model, and even when looking at a relatively common and frequent verb, we do not have enough data to make reasonable estimates for some of its forms. However, if we decompose an unseen form such as trabajaréis 'you will work', we find that it is trabajar in future tense and second person plural. This allows us to make the predictions that are needed to decide on the grammaticality (language modeling) or syntax (tagging and parsing) of a sentence. In the first part of this thesis, we develop a morphological language model. A language model estimates the grammaticality and coherence of a sentence. Most language models used today are word-based n-gram models, which means that they estimate the transitional probability of a word following a history, the sequence of the (n - 1) preceding words. The probabilities are estimated from the frequencies of the history and of the history followed by the target word in a huge text corpus. If either of the sequences is unseen, the length of the history has to be reduced. This leads to a less accurate estimate, as less context is taken into account. Our morphological language model estimates an additional probability from the morphological classes of the words. These classes are built automatically by extracting morphological features from the word forms. To this end, we use unsupervised segmentation algorithms to find the suffixes of word forms. Such an algorithm might for example segment trabajaréis into trabaja and réis, and we can then estimate the properties of trabajaréis from other word forms with the same or similar morphological properties. The data-driven nature of the segmentation algorithms allows them to find not only inflectional suffixes (such as -réis), but also more derivational phenomena such as the head nouns of compounds or even endings such as -tec, which identify technology-oriented companies such as Vortec, Memotec and Portec and would not be regarded as a morphological suffix by traditional linguistics. Additionally, we extract shape features such as whether a form contains digits or capital characters. This is important because many rare or unseen forms are proper names or numbers and often do not have meaningful suffixes. Our class-based morphological model is then interpolated with a word-based model to combine the generalization capabilities of the first with the high accuracy of the second in case of sufficient data.
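A toy sketch of the interpolation idea, assuming crude fixed-length suffix classes as a stand-in for the unsupervised morphological classes and ignoring smoothing; it only shows how the class component assigns probability to an unseen word bigram.

```python
from collections import defaultdict

class InterpolatedBigramLM:
    """Bigram LM that interpolates a word-based estimate with a class-based estimate.
    Classes here are simply crude suffix classes standing in for morphological classes."""

    def __init__(self, lam=0.7, suffix_len=3):
        self.lam, self.suffix_len = lam, suffix_len
        self.word_bigrams = defaultdict(lambda: defaultdict(int))
        self.class_bigrams = defaultdict(lambda: defaultdict(int))

    def word_class(self, w):
        return w[-self.suffix_len:]          # e.g. 'trabajamos' -> 'mos'

    def train(self, sentences):
        for sent in sentences:
            for prev, cur in zip(sent, sent[1:]):
                self.word_bigrams[prev][cur] += 1
                self.class_bigrams[self.word_class(prev)][self.word_class(cur)] += 1

    def prob(self, prev, cur):
        def rel_freq(table, a, b):
            total = sum(table[a].values())
            return table[a][b] / total if total else 0.0
        p_word = rel_freq(self.word_bigrams, prev, cur)
        p_class = rel_freq(self.class_bigrams, self.word_class(prev), self.word_class(cur))
        return self.lam * p_word + (1.0 - self.lam) * p_class

lm = InterpolatedBigramLM()
lm.train([["nosotros", "trabajamos", "hoy"], ["vosotros", "trabajáis", "hoy"]])
# The class component gives an unseen word bigram some probability via its suffix classes
print(lm.prob("vosotros", "trabajamos"))
```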
We evaluate our model across 21 European languages and find improvements between 3% and 11% in perplexity, a standard language modeling evaluation measure. Improvements are highest for languages with a more productive and complex morphology such as Finnish and Estonian, but also visible for languages with a relatively simple morphology such as English and Dutch. We conclude that a morphological component yields consistent improvements for all the tested languages and argue that it should be part of every language model. Dependency trees represent the syntactic structure of a sentence by attaching each word to its syntactic head, the word it is directly modifying. Dependency parsing is usually tackled using heavily lexicalized (word-based) models, and a thorough morphological preprocessing is important for optimal performance, especially for MRLS. We investigate whether the lack of morphological features can be compensated by features induced using hidden Markov models with latent annotations (HMM-LAs) and find this to be the case for German. HMM-LAs were proposed as a method to increase part-of-speech tagging accuracy. The model splits the observed part-of-speech tags (such as verb and noun) into subtags. An expectation maximization algorithm is then used to fit the subtags to different roles. A verb tag, for example, might be split into an auxiliary verb and a full verb subtag. Such a split is usually beneficial because these two verb classes have different contexts. That is, a full verb might follow an auxiliary verb, but usually not another full verb. For German and English, we find that our model leads to consistent improvements over a parser not using subtag features. Looking at the labeled attachment score (LAS), the percentage of words attached to the correct head with the correct label, we observe an improvement from 90.34 to 90.75 for English and from 87.92 to 88.24 for German. For German, we additionally find that our model achieves almost the same performance (88.24) as a model using tags annotated by a supervised morphological tagger (LAS of 88.35). We also find that the German latent tags correlate with morphology. Articles, for example, are split by their grammatical case. We also investigate the part-of-speech tagging accuracies of models using the traditional treebank tagset and models using induced tagsets of the same size and find that the latter outperform the former, but are in turn outperformed by a discriminative tagger. Furthermore, we present a method for fast and accurate morphological tagging. While part-of-speech tagging annotates tokens in context with their respective word categories, morphological tagging produces a complete annotation containing all the relevant inflectional features such as case, gender and tense. A complete reading is represented as a single tag. As a reading might consist of several morphological features, the resulting tagset usually contains hundreds or even thousands of tags. This is an issue for many decoding algorithms such as Viterbi, whose runtimes depend quadratically on the number of tags. In the case of morphological tagging, the problem can be avoided by using a morphological analyzer. A morphological analyzer is a manually created finite-state transducer that produces the possible morphological readings of a word form. This analyzer can be used to prune the tagging lattice and to allow for the application of standard sequence labeling algorithms.
The downside of this approach is that such an analyzer is not available for every language or might not have the coverage required for the task. Additionally, the output tags of some analyzers are not compatible with the annotations of the treebanks, which might require a manual mapping between the different annotations or even a reduction of the annotation's complexity. To avoid this problem, we propose to use the posterior probabilities of a conditional random field (CRF) lattice to prune the space of possible taggings. At the zero-order level, the posterior probabilities of a token can be calculated independently of the other tokens of a sentence. The necessary computations can thus be performed in linear time. The features available to the model at this stage are similar to the features used by a morphological analyzer (essentially the word form and features based on it), but also include the immediate lexical context. As the ambiguity of word types varies substantially, we fix the average number of readings after pruning by dynamically estimating a probability threshold. Once we obtain the pruned lattice, we can add tag transitions and convert it into a first-order lattice. The quadratic forward-backward computations are now executed only on the remaining plausible readings and are thus efficient. We can then continue pruning and extending the lattice order at a relatively low additional runtime cost (depending on the pruning thresholds). The training of the model can be implemented efficiently by applying stochastic gradient descent (SGD). The CRF gradient can be calculated from a lattice of any order as long as the correct reading is still in the lattice. During training, we thus run the lattice pruning until we either reach the maximal order or the correct reading is pruned. If the reading is pruned, we perform the gradient update with the highest-order lattice still containing the reading. This approach is similar to early updating in the structured perceptron literature and forces the model to learn how to keep the correct readings in the lower-order lattices. In practice, we observe a high number of lower-order updates during the first training epoch and almost exclusively higher-order updates during later epochs. We evaluate our CRF tagger on six languages with different morphological properties. We find that for languages with a high word form ambiguity such as German, the pruning results in a moderate drop in tagging accuracy, while for languages with less ambiguity such as Spanish and Hungarian the loss due to pruning is negligible. However, our pruning strategy allows us to train higher-order models (order > 1), which give substantial improvements for all languages and also outperform unpruned first-order models. That is, the model might lose some of the correct readings during pruning, but it is also able to solve more of the harder cases that require more context. We also find our model to substantially and significantly outperform a number of frequently used taggers such as Morfette and SVMTool.
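A highly simplified sketch of the zero-order pruning step, assuming plain per-token softmax scores and a fixed threshold in place of the CRF posteriors and the dynamically estimated threshold described above; the tagset and scores are invented.

```python
import math

def softmax(scores):
    m = max(scores.values())
    exps = {t: math.exp(s - m) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: v / z for t, v in exps.items()}

def prune_zero_order(sentence_scores, threshold=0.05):
    """Zero-order pruning: per token, keep only tags whose posterior (here a plain
    softmax over emission-style scores) exceeds a threshold. Higher-order transition
    features would then only be evaluated on the surviving readings."""
    pruned = []
    for scores in sentence_scores:
        post = softmax(scores)
        keep = {t for t, p in post.items() if p >= threshold}
        keep.add(max(post, key=post.get))   # never prune away every reading
        pruned.append(keep)
    return pruned

# Toy per-token scores over a tiny morphological tagset
sentence = [
    {"Noun.Nom.Sg": 2.0, "Noun.Acc.Sg": 1.8, "Verb.3.Sg": -1.0},
    {"Verb.3.Sg": 3.0, "Noun.Nom.Sg": -0.5, "Noun.Acc.Sg": -0.6},
]
print(prune_zero_order(sentence))
```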
Based on our morphological tagger, we develop a simple method to increase the performance of a state-of-the-art constituency parser. A constituency tree describes the syntactic properties of a sentence by assigning spans of text to a hierarchical bracket structure. A language-independent approach for the automatic annotation of accurate and compact grammars was developed in prior work. Its implementation -- known as the Berkeley parser -- gives state-of-the-art results for many languages such as English and German. For some MRLS such as Basque and Korean, however, the parser gives unsatisfactory results because of its simple unknown word model. This model maps unknown words to a small number of signatures (similar to our morphological classes). These signatures do not seem expressive enough for many of the subtle distinctions made during parsing. We propose to instead replace rare words by the morphological readings generated by our tagger. The motivation is twofold. First, our tagger has access to a number of lexical and sublexical features not available during parsing. Second, we expect the morphological readings to contain most of the information required to make the correct parsing decision, even though we know that decisions such as the correct attachment of prepositional phrases might require some notion of lexical semantics. In experiments on the SPMRL 2013 dataset of nine MRLS, we find that our method gives improvements for all languages except French, for which we observe a minor drop in the Parseval score of 0.06. For Hebrew, Hungarian and Basque we find substantial absolute improvements of 5.65, 11.87 and 15.16, respectively. We also performed an extensive evaluation of the utility of word representations for morphological tagging. Our goal was to reduce the drop in performance that is caused when a model trained on a specific domain is applied to some other domain. This problem is usually addressed by domain adaptation (DA). DA adapts a model towards a specific domain using a small amount of labeled or a huge amount of unlabeled data from that domain. However, this procedure requires us to train a model for every target domain. Instead, we are trying to build a robust system that is trained on domain-specific labeled and domain-independent or general unlabeled data. We believe word representations to be key in the development of such models because they allow us to leverage unlabeled data efficiently. We compare data-driven representations to manually created morphological analyzers. We understand data-driven representations as models that cluster word forms or map them to a vectorial representation. Examples heavily used in the literature include Brown clusters, singular value decompositions of count vectors and neural-network-based embeddings. We create a test suite of six languages consisting of in-domain and out-of-domain test sets. To this end, we converted annotations for Spanish and Czech and annotated the German part of the Smultron treebank with a morphological layer. In our experiments on these data sets we find Brown clusters to outperform the other data-driven representations. Regarding the comparison with morphological analyzers, we find Brown clusters to give slightly better performance in part-of-speech tagging, but to be substantially outperformed in morphological tagging.
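The rare-word replacement described above can be pictured as a simple preprocessing pass over the treebank; the sketch below is schematic, with an invented reading and threshold, and is not the actual SPMRL pipeline.

```python
from collections import Counter

def replace_rare_words(sentences, readings, min_count=2, placeholder="UNKNOWN"):
    """Replace rare word forms by their morphological reading before parsing.
    `readings` maps word forms to tags such as 'noun.pl.inessive'; forms without a
    reading fall back to a generic placeholder."""
    counts = Counter(w for sent in sentences for w in sent)
    out = []
    for sent in sentences:
        out.append([w if counts[w] >= min_count else readings.get(w, placeholder)
                    for w in sent])
    return out

sentences = [["the", "etxeetan", "dog"], ["the", "cat"], ["the", "dog"]]
readings = {"etxeetan": "noun.pl.inessive"}   # invented reading for a rare Basque form
print(replace_rare_words(sentences, readings))
```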
Grasping objects with our hands allows us to skillfully move and manipulate them. Hand-held tools further extend our capabilities by adapting the precision, power, and shape of our hands to the task at hand. Some of these tools, such as mobile phones or computer mice, already incorporate information processing capabilities. Many other tools may be augmented with small, energy-efficient digital sensors and processors. This allows graspable objects to learn about the user grasping them and to support the user's goals. For example, the way we grasp a mobile phone might indicate whether we want to take a photo or call a friend with it, and could thus serve as a shortcut to that action. A power drill might sense whether the user is grasping it firmly enough and refuse to turn on if this is not the case. And a computer mouse could distinguish between intentional and unintentional movement and ignore the latter. This dissertation gives an overview of grasp sensing for human-computer interaction, focusing on technologies for building grasp-sensitive surfaces and challenges in designing grasp-sensitive user interfaces. It comprises three major contributions: a comprehensive review of existing research on human grasping and grasp sensing, a detailed description of three novel prototyping tools for grasp-sensitive surfaces, and a framework for analyzing and designing grasp interaction: For nearly a century, scientists have analyzed human grasping. My literature review gives an overview of definitions, classifications, and models of human grasping. A small number of studies have investigated grasping in everyday situations. They found a much greater diversity of grasps than described by existing taxonomies. This diversity makes it difficult to directly associate certain grasps with users' goals. In order to structure related work and my own research, I formalize a generic workflow for grasp sensing. It comprises *capturing* of sensor values, *identifying* the associated grasp, and *interpreting* the meaning of the grasp. A comprehensive overview of related work shows that the implementation of grasp-sensitive surfaces is still hard, that researchers often are not aware of related work from other disciplines, and that intuitive grasp interaction has not yet received much attention. In order to address the first issue, I developed three novel sensor technologies designed for grasp-sensitive surfaces. These mitigate one or more limitations of traditional sensing techniques: **HandSense** uses four strategically positioned capacitive sensors for detecting and classifying grasp patterns on mobile phones. The use of custom-built high-resolution sensors allows detecting proximity and avoids the need to cover the whole device surface with sensors. User tests showed a recognition rate of 81%, comparable to that of a system with 72 binary sensors. **FlyEye** uses optical fiber bundles connected to a camera for detecting touch and proximity on arbitrarily shaped surfaces. It allows rapid prototyping of touch- and grasp-sensitive objects and requires only very limited electronics knowledge. For FlyEye I developed a *relative calibration* algorithm that allows determining the locations of groups of sensors whose arrangement is not known. **TDRtouch** extends Time Domain Reflectometry (TDR), a technique traditionally used for inspecting cable faults, for touch and grasp sensing.
TDRtouch is able to locate touches along a wire, allowing designers to rapidly prototype and implement modular, extremely thin, and flexible grasp-sensitive surfaces. I summarize how these technologies cater to different requirements and significantly expand the design space for grasp-sensitive objects. Furthermore, I discuss challenges for making sense of raw grasp information and categorize interactions. Traditional application scenarios for grasp sensing use only the grasp sensor's data, and only for mode-switching. I argue that data from grasp sensors is part of the general usage context and should only be used in combination with other context information. For analyzing and discussing the possible meanings of grasp types, I created the GRASP model. It describes five categories of influencing factors that determine how we grasp an object: *Goal* -- what we want to do with the object, *Relationship* -- what we know and feel about the object we want to grasp, *Anatomy* -- hand shape and learned movement patterns, *Setting* -- surrounding and environmental conditions, and *Properties* -- texture, shape, weight, and other intrinsics of the object. I conclude the dissertation with a discussion of upcoming challenges in grasp sensing and grasp interaction, and provide suggestions for implementing robust and usable grasp interaction.
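As a generic illustration of the *identifying* step of the grasp-sensing workflow, the following sketch classifies a four-sensor capacitive reading by its nearest prototype; the sensor values, grasp labels and the nearest-prototype rule are assumptions for illustration, not HandSense's actual classifier.

```python
import math

# Toy 'grasp signatures': mean readings of four capacitive sensors (invented values)
PROTOTYPES = {
    "phone_call":      (0.9, 0.8, 0.1, 0.7),
    "photo_landscape": (0.2, 0.9, 0.9, 0.2),
    "in_pocket":       (0.95, 0.95, 0.95, 0.95),
}

def classify_grasp(reading):
    """Nearest-prototype classification of a 4-sensor capacitive reading."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(PROTOTYPES, key=lambda label: dist(reading, PROTOTYPES[label]))

print(classify_grasp((0.85, 0.75, 0.2, 0.65)))  # -> 'phone_call'
```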
In recent years, we have started to define our lives through experiences we make instead of objects we buy. Attending a concert of our favorite musician may be more important to us than owning an expensive stereo system. Similarly, we define interactive systems not only by the quality of the display or its usability, but rather by the experiences we can have when using the device. A cell phone is primarily built for making calls and receiving text messages, but on an emotional level it might provide a way to be close to our loved ones, even though they are sometimes far away. When designing interactive technology, we do not only have to answer the question of how people use our systems, but also why they use them. Thus, we need to concentrate on experiences, feelings and emotions arising during interaction. Experience Design is an approach focusing on the story that a product communicates before implementing the system. In an interdisciplinary team of psychologists, industrial designers, product developers and specialists in human-computer interaction, we applied an Experience Design process to the automotive domain. A major challenge for car manufacturers is the preservation of these experiences throughout the development process. When implementing interactive systems, engineers rely on technical requirements and a set of constraints (e.g., safety) that oftentimes contradict aspects of the designed experience. To resolve this conflict, Experience Prototyping is an important tool for translating experience stories into an actual interactive product. With this thesis I investigate the Experience Design process with a focus on Experience Prototyping. Within the automotive context, I report on three case studies implementing three kinds of interactive systems, forming and following our approach. I implemented (1) an electric vehicle information system called Heartbeat, which communicates the state of the electric drive and the batteries to the driver in an unobtrusive and reassuring way. I integrated Heartbeat into the dashboard of a car mock-up with respect to safety and space requirements, while at the same time holding on to the story in order to achieve a consistent experience. With (2) the Periscope I implemented a mobile navigation device enhancing the social and relatedness experiences of the passengers in the car. I built and evaluated several experience prototypes in different stages of the design process and showed that they transported the designed experience throughout the implementation of the system. Focusing on (3) the experience of freehand gestures, GestShare explored this interaction style for in-car and car-to-car social experiences. We designed and implemented gestural prototypes for small but effective social interactions between drivers and evaluated the system in a lab and an in-situ study. The contributions of this thesis are (1) a definition of Experience Prototyping in the automotive domain resulting from a literature review and my own work, showing the importance and feasibility of Experience Prototyping for Experience Design. I (2) contribute three case studies and describe the details of several prototypes as milestones on the way from an experience story to an interactive system. I (3) derive best practices for Experience Prototyping concerning characteristics such as fidelity, resolution and interactivity, as well as the evaluation in the lab and in situ in different stages of the process.
In today's business world, decisions have to be made at different levels, including the strategic, tactical, and operational level. Decisions at the strategic level are characterized by their complexity, longevity and impact. Such decisions can benefit from the participation of a large, diverse group of people, as they contribute different background knowledge, perspectives, and evaluation criteria. Typically, such decisions need to be considered over a prolonged period of time, as opinions may need to be formed or may change due to the availability of new information. The goal of people in group decision making situations is typically to achieve good decisions. A mechanism is thus desirable that is capable of addressing the aforementioned challenges and of producing a good decision. For this work, a decision is considered good if it is predominantly based on the sincere opinions of the participants. In this thesis, we investigate the market metaphor as a promising approach for group decision making. Markets are attributed with the capability of gathering and aggregating assessments from people in a single indicator, the price. They allow for continued participation over a prolonged time, reversibility of one's market position by repeated trading, and the usage of individual evaluation criteria. For investigating the application of the market metaphor to decision making, we develop LDM, a market-based approach for group decision making. There, we represent a pending decision as a market and the decision options as stocks. Participants then buy shares of their favored stocks and sell shares of the stocks they dislike. High demand leads to rising prices, whereas low demand results in low prices. The most favored decision options can be identified from the ranking of the stocks according to their prices. To support the achievement of a good decision, we model the market behavior of participants, devise design principles, identify suitable application scenarios, and determine appropriate functionalities for a market software. We furthermore devise the concept of market perturbations for uncovering the trading intentions of participants. We also implement a web-based software prototype of LDM. It provides functionalities for decision making, market trading, user handling, information exchange, and market perturbations. Participants trade their favored stocks using virtual play money. We test the LDM approach and its software prototype in an EU-funded project, in a lab study, in the selection of research proposals, and in a university seminar for scenario building.
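A toy sketch of how a ranking of decision options can emerge from trading; the linear price response is an assumption for illustration and stands in for LDM's actual price formation.

```python
class DecisionMarket:
    """Toy decision market: each decision option is a stock whose price moves with
    net demand; the final ranking of prices indicates the preferred options."""

    def __init__(self, options, start_price=100.0, sensitivity=1.0):
        self.prices = {o: start_price for o in options}
        self.sensitivity = sensitivity

    def trade(self, option, shares):
        """Positive shares = buy (demand), negative = sell; price reacts linearly here."""
        self.prices[option] = max(1.0, self.prices[option] + self.sensitivity * shares)

    def ranking(self):
        return sorted(self.prices, key=self.prices.get, reverse=True)

market = DecisionMarket(["Option A", "Option B", "Option C"])
market.trade("Option A", +30)   # participants favour A
market.trade("Option B", -10)   # and dislike B
market.trade("Option C", +5)
print(market.ranking(), market.prices)
```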
Penalized regression models offer a way to integrate covariate selection into model estimation. Penalized approaches are particularly well suited to accounting for complex structures in a model's covariates. This thesis deals with various penalization approaches for discrete structures, where the term "discrete structure" refers to all kinds of categorical predictors, effect-modifying categorical predictors, and group-specific effects in hierarchically structured data. What they have in common is that they can lead to a comparatively large number of coefficients to be estimated. There is therefore particular interest in knowing which categories of a predictor influence the response, and which categories have different or similar effects on the response. Categories with similar effects can be identified, for example, by fused lasso penalties. However, some existing approaches are restricted to the linear model. The present thesis extends these approaches to the class of generalized linear models, covering both computational and theoretical aspects. Specifically, a fused lasso penalty for effect-modifying categorical predictors in generalized linear models is proposed. It makes it possible to select predictors and to fuse categories of a predictor. Group-specific effects, which account for heterogeneity in hierarchically structured data, are a special case of such an effect-modifying categorical variable. Here, the penalized approach offers two major advantages: (i) in contrast to mixed models, which make stronger assumptions, the degree of heterogeneity can easily be reduced; (ii) estimation is more efficient than in the unpenalized approach. In orthonormal settings, fused lasso penalties can have conceptual drawbacks. As an alternative, an L0 penalty for discrete structures in generalized linear models is discussed, where the so-called L0 "norm" denotes an indicator function for non-zero arguments. As a penalty, this function is as interesting as it is challenging. If an approximation of the L0 norm is considered as a loss function, in the limit the conditional mode of a response is estimated.
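A generic fused-lasso-type penalty for a categorical predictor with categories 1, ..., K may help to picture the selection and fusion effect; the exact parameterization used in the thesis (for example for effect modifiers or ordinal factors) may differ.

```latex
% Generic fused-lasso-type penalty for a nominal categorical predictor with
% category-specific coefficients \beta_1, \ldots, \beta_K. The first term shrinks
% coefficients towards zero (selection), the second shrinks pairwise differences
% towards zero (fusion of categories with similar effects).
J(\beta) \;=\; \lambda_1 \sum_{k=1}^{K} \lvert \beta_k \rvert
        \;+\; \lambda_2 \sum_{k<l} \lvert \beta_k - \beta_l \rvert
```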
The combination of survey data with more objective information, such as administrative records, is a promising innovation within social science research. The advantages of such projects are manifold, but implementing them also poses challenges that need to be considered. For example, the survey respondents typically have to consent to the linking of external data sources, and interviewers have to feel comfortable with this task. This dissertation investigates whether and to what extent the interviewers have an influence on the willingness of the respondents to participate in two new projects within the Survey of Health, Ageing and Retirement in Europe (SHARE). Both projects had the goal of reducing the burden for respondents and increasing data quality by linking the survey data with additional, more objective data. Both linkages required the interviewers to collect respondents' written consent during the interview. The starting point of this dissertation is the question of what influences respondents' decisions to consent to linking their survey answers with administrative data. Three different areas are considered: characteristics of the respondents, of the interviewers, and of the interaction between respondents and interviewers. The results suggest that although respondent and household characteristics are important, a large part of the variation is explained by the interviewers. However, the information available about interviewers in SHARE is limited to a few demographic characteristics. Therefore, it is difficult to identify key interviewer characteristics that influence the consent process. To close this research gap, a detailed interviewer survey was developed and implemented in SHARE. This survey covers four different dimensions of interviewer characteristics: interviewers' attitudes, their own behavior, their experiences with surveys and special measurements, and their expectations regarding their success. These dimensions are applied to several aspects of the survey process, such as unit or item nonresponse as well as the specific projects of the corresponding SHARE questionnaire. The information collected in the interviewer survey is then used to analyze interviewer effects on respondents' willingness to consent to the collection of blood samples. Those samples are analyzed in a laboratory and the results linked with the survey data. Interviewers' experience and their expectations are of special interest, because these are two characteristics that can be influenced during interviewer training and selection. The results in this dissertation show that the interviewers have a considerable effect on respondents' consent to the collection of biomarkers. Moreover, the information collected in the interviewer survey can explain most of the variance at the interviewer level. A motivation for linking survey data with more objective data is the assumption that survey data suffer from recall error. In the last step, the overlap of information collected in the survey and provided in the administrative records is used to analyze recall error in the reported year of retirement. The comparison of the two datasets shows that most respondents remember the year they retired correctly. Nevertheless, a considerable proportion of respondents make recall errors. Characteristics can be identified that increase the likelihood of a misreport. However, the error seems to be unsystematic, meaning that no pattern of reporting the event of retirement too late or too early is found.
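One common way to quantify "variation explained by the interviewers" is the intraclass correlation of a random-intercept logistic model; the sketch below assumes that model and an invented interviewer-level variance, since the abstract does not specify the analysis.

```python
import math

def interviewer_icc(var_interviewer):
    """Intraclass correlation for a random-intercept logistic model:
    the share of the total variance in the latent consent propensity that is
    attributable to interviewers (residual variance fixed at pi^2 / 3)."""
    residual = math.pi ** 2 / 3.0
    return var_interviewer / (var_interviewer + residual)

# Example: an assumed interviewer-level variance of 1.2 on the logit scale
print(round(interviewer_icc(1.2), 3))  # ~0.267
```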
Network quality of service (QoS) and its management are concerned with providing, guaranteeing and reporting properties of data flows within computer networks. For the past two decades, virtualization has been becoming a very popular tool in data centres, yet without network QoS management capabilities. With virtualization, the management focus shifts from physical components and topologies towards virtual infrastructures (VIs) and their purposes. VIs are designed and managed as independent, isolated entities. Without network QoS management capabilities, VIs cannot offer the same services and service levels as physical infrastructures can, leaving VIs at a disadvantage with respect to applicability and efficiency. This thesis closes this gap and develops a management architecture enabling network QoS management in virtualization environments. First, requirements are derived from real-world scenarios, yielding a validation reference for the proposed architecture. After that, a life cycle for VIs and a taxonomy for network links and virtual components are introduced to align the network QoS management task with the general management of virtualization environments and to enable the creation of technology-specific adaptors for integrating the technologies and sub-services used in virtualization environments. The core aspect shaping the proposed management architecture is a management loop and its corresponding strategy for identifying and ordering sub-tasks. Finally, a prototypical implementation showcases that the presented management approach is suited for network QoS management and enforcement in virtualization environments. The architecture fulfils its purpose and satisfies all identified requirements. Ultimately, network QoS management is one among many aspects of management in virtualization environments, and the architecture presented herein exposes interfaces to other management areas, whose integration is left as future work.
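A generic skeleton of a monitor-analyze-plan-execute loop, given only as an assumption of what such a management loop can look like; the thesis defines its own loop and sub-task ordering strategy, which is not reproduced here.

```python
import time

def qos_management_loop(monitor, analyze, plan, execute, interval_s=5.0, rounds=3):
    """Generic monitor-analyze-plan-execute loop; each phase is passed in as a callable
    so that technology-specific adaptors can be plugged in for the managed components."""
    for _ in range(rounds):
        state = monitor()                 # observe virtual links and components
        violations = analyze(state)       # compare against QoS targets
        actions = plan(violations)        # order the necessary sub-tasks
        execute(actions)                  # push changes via the adaptors
        time.sleep(interval_s)

# Minimal stand-in callables to make the skeleton runnable
qos_management_loop(
    monitor=lambda: {"vlink1": {"latency_ms": 12}},
    analyze=lambda state: [k for k, v in state.items() if v["latency_ms"] > 10],
    plan=lambda violations: [("reconfigure", v) for v in violations],
    execute=lambda actions: print("executing", actions),
    interval_s=0.0,
    rounds=1,
)
```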
Due to the increasing power of data acquisition and data storage technologies, a large amount of data sets with complex structure are collected in the era of data explosion. Instead of simple representations by low-dimensional numerical features, such data sources range from high-dimensional feature spaces to graph data describing relationships among objects. Many techniques exist in the literature for mining simple numerical data, but only a few approaches touch the increasing challenge of mining complex data, such as high-dimensional vectors of non-numerical data types, time series data, graphs, and multi-instance data where each object is represented by a finite set of feature vectors. In addition, there are many important data mining tasks for high-dimensional data, such as clustering, outlier detection, dimensionality reduction, similarity search, classification, prediction and result interpretation. Many algorithms have been proposed to solve these tasks separately, although in some cases they are closely related. Detecting and exploiting the relationships among them is another important challenge. This thesis aims to address these challenges in order to gain new knowledge from complex high-dimensional data. We propose several new algorithms combining different data mining tasks to acquire novel knowledge from complex high-dimensional data: ROCAT (Relevant Overlapping Subspace Clusters on Categorical Data) automatically detects the most relevant overlapping subspace clusters on categorical data. It integrates clustering, feature selection and pattern mining without any input parameters in an information-theoretic way. The next algorithm, MSS (Multiple Subspace Selection), finds multiple low-dimensional subspaces for moderately high-dimensional data, each exhibiting an interesting cluster structure. For better interpretation of the results, MSS visualizes the clusters in multiple low-dimensional subspaces in a hierarchical way. SCMiner (Summarization-Compression Miner) focuses on bipartite graph data; it integrates co-clustering, graph summarization, link prediction, and the discovery of the hidden structure of bipartite graph data on the basis of data compression. Finally, we propose a novel similarity measure for multi-instance data: the Probabilistic Integral Metric (PIM) is based on a probabilistic generative model requiring few assumptions. Experiments demonstrate the effectiveness and efficiency of PIM for similarity search (multi-instance data indexing with the M-tree), explorative data analysis and data mining (multi-instance classification). To sum up, we propose algorithms that combine different data mining tasks for complex data with various data types and data structures to discover the novel knowledge hidden behind the complex data.
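PIM itself is not reproduced here; as a hint of what a multi-instance similarity computes, the sketch below uses the common minimal-instance-distance baseline on two toy bags of feature vectors.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minimal_instance_distance(bag_a, bag_b):
    """A common multi-instance baseline: the distance between two bags of feature
    vectors is the smallest distance between any pair of their instances."""
    return min(euclidean(a, b) for a in bag_a for b in bag_b)

bag1 = [(0.0, 0.0), (1.0, 1.0)]
bag2 = [(0.9, 1.2), (5.0, 5.0)]
print(round(minimal_instance_distance(bag1, bag2), 3))  # 0.224
```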
Data-intensive applications in fields like biology, medicine, and neuroscience require effective and efficient data mining technologies. Advanced data acquisition methods produce data of constantly increasing volume and complexity. As a consequence, the need for new data mining technologies to deal with complex data has emerged during the last decades. In this thesis, we focus on the data mining task of clustering, in which objects are separated into different groups (clusters) such that objects inside a cluster are more similar to each other than to objects in different clusters. In particular, we consider density-based clustering algorithms and their applications in biomedicine. The core idea of the density-based clustering algorithm DBSCAN is that each object within a cluster must have a certain number of other objects inside its neighborhood. Compared with other clustering algorithms, DBSCAN has many attractive benefits, e.g., it can detect clusters with arbitrary shape and is robust to outliers. Thus, DBSCAN has attracted a lot of research interest during the last decades, with many extensions and applications. In the first part of this thesis, we aim at developing new algorithms based on the DBSCAN paradigm to deal with the new challenges of complex data, particularly expensive distance measures and incomplete availability of the distance matrix. Like many other clustering algorithms, DBSCAN suffers from poor performance when facing expensive distance measures for complex data. To tackle this problem, we propose a new algorithm based on the DBSCAN paradigm, called Anytime Density-based Clustering (A-DBSCAN), that works in an anytime scheme: in contrast to the original batch scheme of DBSCAN, A-DBSCAN first produces a quick approximation of the clustering result and then continuously refines the result during its further run. Experts can interrupt the algorithm, examine the results, and choose between (1) stopping the algorithm at any time whenever they are satisfied with the result, to save runtime, and (2) continuing the algorithm to achieve better results. This kind of anytime scheme has been proven in the literature to be a very useful technique when dealing with time-consuming problems. We also introduce an extended version of A-DBSCAN, called A-DBSCAN-XS, which is more efficient and effective than A-DBSCAN when dealing with expensive distance measures. Since DBSCAN relies on the cardinality of the neighborhood of objects, it requires the full distance matrix to perform. For complex data, these distances are usually expensive, time-consuming or even impossible to acquire due to high cost, high time complexity, noisy and missing data, etc. Motivated by these potential difficulties of acquiring the distances among objects, we propose another approach for DBSCAN, called Active Density-based Clustering (Act-DBSCAN). Given a budget limitation B, Act-DBSCAN is only allowed to use up to B pairwise distances, ideally producing the same result as if it had the entire distance matrix at hand. The general idea of Act-DBSCAN is that it actively selects the most promising pairs of objects to calculate the distances between them and tries to approximate as much as possible the desired clustering result with each distance calculation. This scheme provides an efficient way to reduce the total cost needed to perform the clustering and thus limits the potential weakness of DBSCAN when dealing with the distance sparseness problem of complex data.
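For reference, a minimal version of the plain batch DBSCAN that A-DBSCAN and Act-DBSCAN build on; it computes all pairwise distances eagerly, which is exactly the cost the anytime and active variants avoid.

```python
import math

def dbscan(points, eps, min_pts, dist):
    """Minimal batch DBSCAN: a point is a core point if at least `min_pts` points
    (including itself) lie within distance `eps`; clusters are grown from core points.
    Returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1                      # noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster             # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:          # j is a core point: keep expanding
                queue.extend(j_nbrs)
    return labels

points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.2), (5.0, 5.0), (5.1, 5.0), (9.0, 9.0)]
print(dbscan(points, eps=0.5, min_pts=2, dist=math.dist))  # [0, 0, 0, 1, 1, -1]
```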
As a fundamental data clustering algorithm, density-based clustering has many applications in diverse fields. In the second part of this thesis, we focus on an application of density-based clustering in neuroscience: the segmentation of white matter fiber tracts in the human brain acquired from Diffusion Tensor Imaging (DTI). We propose a model that evaluates the similarity between two fibers as a combination of structural similarity and connectivity-related similarity of the fiber tracts. Various distance measure techniques from fields like time-sequence mining are adapted to calculate the structural similarity of fibers. Density-based clustering is used as the segmentation algorithm. We show how A-DBSCAN and A-DBSCAN-XS are used as novel solutions for the segmentation of massive fiber datasets and provide unique features to assist experts during the fiber segmentation process.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Prediction problems on high-dimensional molecular data, e.g. the classification of microarray samples into normal and cancer tissues, are complex and ill-posed since the number of variables usually exceeds the number of observations by orders of magnitude. Recent research in the area has propagated a variety of new statistical models in order to handle these new biological datasets. In practice, however, these models are always applied in combination with preprocessing and variable selection methods as well as model selection, which is mostly performed by cross-validation. Varma and Simon (2006) have used the term 'wrapper-algorithm' for this integration of preprocessing and model selection into the construction of statistical models. Additionally, they have proposed the method of nested cross-validation (NCV) as a way of estimating the prediction error of such wrappers, which has since evolved into the gold standard. In the first part, this thesis provides further theoretical and empirical justification for the usage of NCV in the context of wrapper-algorithms. Moreover, a computationally less intensive alternative to NCV is proposed which can be motivated in a decision-theoretic framework. The new method can be interpreted as a smoothed variant of NCV and, in contrast to NCV, guarantees intuitive bounds for the estimation of the prediction error. The second part focuses on the ranking of wrapper algorithms. Cross-study-validation is proposed as an alternative concept to the repetition of separate within-study-validations if several similar prediction problems are available. The concept is demonstrated using six different wrapper algorithms for survival prediction on censored data on a selection of eight breast cancer datasets. Additionally, a parametric bootstrap approach for simulating realistic data from such related prediction problems is described and subsequently applied to illustrate the concept of cross-study-validation for the ranking of wrapper algorithms. Eventually, the last part approaches computational aspects of the analyses and simulations performed in the thesis. The preprocessing before the analysis as well as the evaluation of the prediction models requires the usage of large computing resources. Parallel computing approaches are illustrated on cluster, cloud and high performance computing resources using the R programming language. Usage of heterogeneous hardware and processing of large datasets are covered as well as the implementation of the R-package survHD for the analysis and evaluation of high-dimensional wrapper algorithms for survival prediction from censored data.
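As an illustration of the nested cross-validation idea described above, here is a minimal sketch in which an ordinary classifier stands in for the survival models of the thesis; the wrapper (variable selection plus model selection) is tuned in an inner loop while an outer loop estimates its prediction error. The dataset and parameter grid are illustrative assumptions.

```python
# Minimal sketch of nested cross-validation (NCV) for a "wrapper-algorithm"
# that bundles variable selection and model selection. A scikit-learn style
# classifier stands in for the survival models discussed in the abstract.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=120, n_features=500, n_informative=10,
                           random_state=0)  # p >> n, as with microarray data

# The wrapper: variable selection + classifier, tuned by an inner CV loop.
wrapper = GridSearchCV(
    Pipeline([("select", SelectKBest(f_classif)),
              ("clf", LogisticRegression(max_iter=1000))]),
    param_grid={"select__k": [10, 50, 100], "clf__C": [0.1, 1.0, 10.0]},
    cv=KFold(n_splits=3, shuffle=True, random_state=1),  # inner loop
)

# The outer loop estimates the prediction error of the *whole* wrapper,
# so that the tuning step never sees the held-out data.
outer = KFold(n_splits=5, shuffle=True, random_state=2)
scores = cross_val_score(wrapper, X, y, cv=outer)
print("nested-CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```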
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
In today's premium vehicles, up to 80 electronic control units communicate over up to six different networking technologies. At the same time, vehicle communication is opening up to the outside: the vehicle also communicates with the driver's smartphone and with the Internet. For communication across different application domains within the vehicle, gateways are currently required to translate between incompatible protocols. The trend in vehicle communication is therefore moving towards the Internet Protocol (IP), which was developed for communication across technologies and domains. Besides a consistent protocol on the network layer, the efficient development of a complex, distributed system such as a vehicle also requires a suitable communication middleware. Communication in a vehicle places special requirements on such a middleware. On the one hand, different communication paradigms are used in vehicles, for example signal-based and function-based communication. On the other hand, the communication partners in a vehicle can differ considerably in terms of their resources and their complexity. No existing IP-based communication middleware fulfills the requirements for in-vehicle use identified in this thesis. The goal of this work is therefore to design a communication middleware that is suitable for use in vehicles. The presented solution provides several interoperable variants of the middleware, which resolve the conflict between differing functional requirements on the one hand and very heterogeneous communication partners on the other. A further elementary part of the solution is the realization of the communication paradigms required in the vehicle. The function-based paradigm is implemented through simple remote procedure calls. The signal-based paradigm is implemented through a notification concept built on top of them. This enables an implementation that is oriented more closely towards the current information demand than the simple distribution of data in today's in-vehicle networks. It is shown that, in principle, both communication paradigms can be mapped onto a single mechanism which, depending on the variants involved, operates with dynamic or only static data. Scalable marshalling additionally takes into account the different requirements of the applications and the different capabilities of the participating control units. This enables end-to-end communication between all applications in the IP-based in-vehicle network. On this basis, the solution is extended with important system services. These services implement functions that can only be provided through the cooperation of several components, or they encapsulate common communication functionality for easy reuse. Two system services that are important for in-vehicle use are presented as prototypes: a service management enables the administration of services in different states, and a security management maps security goals to the best possible combination of the security protocols implemented by the communication partners involved. These system services are themselves scalable and can thus be adapted to the concept of different variants of the communication middleware.
Performance measurements on the prototypes developed in the course of this work show that the designed communication middleware is suitable for use on embedded systems in vehicles. The test setup is based on typical use cases for vehicle communication and uses automotive-qualified embedded computing platforms. In particular, it is demonstrated that even low-power control units can be integrated into the system with the described concept. The IP-based communication middleware can thus be used consistently on all relevant control units in the vehicle.
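The following is a conceptual sketch, not the middleware designed in the thesis, of how a function-based paradigm (remote procedure calls) and a signal-based paradigm (notifications) can be served by one mechanism: a notification is modeled as a subscription whose callback is delivered through the same dispatch path as an RPC. All names and interfaces are illustrative assumptions.

```python
# Conceptual sketch: one dispatch mechanism serving both the function-based
# paradigm (RPC) and the signal-based paradigm (notifications).
from collections import defaultdict

class MiniMiddleware:
    def __init__(self):
        self._methods = {}                      # "service.method" -> callable
        self._subscribers = defaultdict(list)   # signal name -> callbacks

    # function-based paradigm: register and call remote procedures
    def register(self, name, func):
        self._methods[name] = func

    def call(self, name, *args, **kwargs):
        return self._methods[name](*args, **kwargs)

    # signal-based paradigm: built on top of the same dispatch idea
    def subscribe(self, signal, callback):
        self._subscribers[signal].append(callback)

    def publish(self, signal, value):
        for callback in self._subscribers[signal]:
            callback(value)                     # push only to interested parties

if __name__ == "__main__":
    mw = MiniMiddleware()
    mw.register("climate.set_temperature", lambda celsius: f"set to {celsius} C")
    print(mw.call("climate.set_temperature", 21))           # RPC style
    mw.subscribe("vehicle.speed", lambda v: print("speed:", v))
    mw.publish("vehicle.speed", 87)                          # notification style
```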
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Interactive surfaces are a growing trend in many domains. As one possible manifestation of Mark Weiser's vision of ubiquitous and disappearing computers embedded in everyday objects, we see touch-sensitive screens in many kinds of devices, such as smartphones, tablet computers and interactive tabletops. More advanced concepts of these have been an active research topic for many years. This has also influenced automotive cockpit development: concept cars and recent market releases show integrated touchscreens, growing in size. To meet the increasing information and interaction needs, interactive surfaces offer context-dependent functionality in combination with a direct input paradigm. However, interfaces in the car need to be operable while driving. Distraction, especially visual distraction from the driving task, can lead to critical situations if the sum of attentional demand emerging from both primary and secondary task overextends the available resources. So far, a touchscreen requires a lot of visual attention since its flat surface does not provide any haptic feedback. There have been approaches to make direct touch interaction accessible while driving for simple tasks. Outside the automotive domain, for example in office environments, concepts for the sophisticated handling of large displays have already been introduced. Moreover, technological advances lead to new characteristics for interactive surfaces by enabling arbitrary surface shapes. In cars, two main characteristics for upcoming interactive surfaces are largeness and shape. On the one hand, spatial extension is increasing not only through larger displays, but also by taking objects in the surroundings into account for interaction. On the other hand, the flatness inherent in current screens can be overcome by upcoming technologies, and interactive surfaces can therefore provide haptically distinguishable surfaces. This thesis describes the systematic exploration of large and shaped interactive surfaces and analyzes their potential for interaction while driving. To this end, different prototypes for each characteristic have been developed and evaluated in test settings suitable for their maturity level. Those prototypes were used to obtain subjective user feedback and objective data, to investigate effects on driving and glance behavior as well as usability and user experience. As a contribution, this thesis provides an analysis of the development of interactive surfaces in the car. Two characteristics, largeness and shape, are identified that can improve the interaction compared to conventional touchscreens. The presented studies show that large interactive surfaces can provide new and improved ways of interaction both in driver-only and driver-passenger situations. Furthermore, studies indicate a positive effect on visual distraction when additional static haptic feedback is provided by shaped interactive surfaces. Overall, various interaction concepts, which are not mutually exclusive, prove the potential of interactive surfaces for use in automotive cockpits, which is expected to be beneficial also in other environments where visual attention needs to be focused on additional tasks.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Semantic search engines aim at improving conventional search with semantic information, or meta-data, on the data searched for and/or on the searchers. So far, approaches to semantic search exploit characteristics of the searchers like age, education, or spoken language for selecting and/or ranking search results. Such data make it possible to build up a semantic search engine as an extension of a conventional search engine. The crawlers of well-established search engines like Google, Yahoo! or Bing can index documents but, so far, their capabilities to recognize the intentions of searchers are still rather limited. Indeed, taking into account characteristics of the searchers considerably extends both the quantity of data to analyse and the dimensionality of the search problem. Well-established search engines therefore still focus on general search, that is, "search for all", not on specialized search, that is, "search for a few". This thesis reports on techniques that have been adapted or conceived, deployed, and tested for building a semantic search engine for the very specific context of artworks. In contrast to, for example, the interpretation of X-ray images, the interpretation of artworks is far from being fully automatable. Therefore artwork interpretation has been based on Human Computation, that is, a software-based gathering of contributions by many humans. The approach reported on in this thesis first relies on so-called Games With A Purpose, or GWAPs, for this gathering: casual games provide an incentive for a potentially unlimited community of humans to contribute their appreciations of artworks. Designing convenient incentives is less trivial than it might seem at first. An ecosystem of games is needed so as to collect the intended meta-data on artworks: one game generates the data that can serve as input for another game. This results in semantically rich meta-data that can be used for building up a successful semantic search engine. Thus, a first part of this thesis reports on a "game ecosystem" specifically designed from one known game and including several novel games belonging to the following game classes: (1) Description Games for collecting obvious and trivial meta-data, basically the well-known ESP (for extra-sensory perception) game of Luis von Ahn, (2) the Dissemination Game Eligo generating translations, (3) the Diversification Game Karido aiming at sharpening differences between the objects, that is, the artworks, interpreted, and (4) the Integration Games Combino, Sentiment and TagATag that generate structured meta-data. Secondly, the approach to building a semantic search engine reported on in this thesis relies on Higher-Order Singular Value Decomposition (SVD). More precisely, the data and meta-data on artworks gathered with the aforementioned GWAPs are collected in a tensor, that is, a mathematical structure generalising matrices to more than only two dimensions, columns and rows. The dimensions considered are the artwork descriptions, the players, and the artworks themselves. A Higher-Order SVD of this tensor is first used for noise reduction; this thesis also reports on deploying a Higher-Order LSA. The parallel Higher-Order SVD algorithm applied for the Higher-Order LSA and its implementation have been validated on an application related to, but independent of, the semantic search engine for artworks striven for: image compression. This thesis reports on the surprisingly good image compression which can be achieved with Higher-Order SVD.
While existing compression methods are based on a matrix SVD for each color, the approach reported on in this thesis relies on one single (higher-order) SVD of the whole tensor. This results both in a better quality of the compressed image and in a significant reduction of the memory space needed. Higher-Order SVD is extremely time-consuming, which calls for parallel computation. Thus, a step towards automatizing the construction of a semantic search engine for artworks was parallelizing the higher-order SVD method used and running the resulting parallel algorithm on a super-computer. This thesis reports on using Hestenes' method and R-SVD for parallelising the higher-order SVD. This method is an unconventional choice which is explained and motivated. As for the super-computer needed, this thesis reports on turning the web browsers of the players or searchers into a distributed parallel computer. This is done by a novel specific system and a novel implementation of the MapReduce framework for data parallelism. Harnessing the web browsers of the players or searchers saves computational power on the server side. It also scales extremely well with the number of players or searchers, because both playing with and searching for artworks require human reflection and therefore result in idle local processors that can be brought together into a distributed super-computer.
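To illustrate the tensor-based compression idea, here is a minimal sketch of a truncated Higher-Order SVD (HOSVD) applied to an RGB image tensor: one decomposition of the whole tensor instead of a matrix SVD per color channel. The rank choices and the random stand-in image are illustrative assumptions; the thesis parallelises this computation (Hestenes' method, R-SVD), which the sketch does not attempt.

```python
# Minimal sketch of truncated Higher-Order SVD (HOSVD / Tucker) compression of
# an RGB image tensor. Ranks and input are illustrative, not from the thesis.
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, U, mode):
    """Multiply tensor T by matrix U along the given mode."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, mode)), 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one factor matrix per mode, then the core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = mode_dot(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    T = core
    for mode, U in enumerate(factors):
        T = mode_dot(T, U, mode)
    return T

if __name__ == "__main__":
    image = np.random.rand(128, 128, 3)            # stand-in for a real RGB image
    core, factors = hosvd(image, ranks=(30, 30, 3))
    approx = reconstruct(core, factors)
    stored = core.size + sum(U.size for U in factors)
    print("relative error:", np.linalg.norm(image - approx) / np.linalg.norm(image))
    print("compression ratio:", image.size / stored)
```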
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Computer security is a very technical topic that is in many cases hard to grasp for the average user. Especially when using the Internet, the biggest network connecting computers globally, security and safety are important. In many cases they can be achieved without the user's active participation: securely storing user and customer data on Internet servers is the task of the respective company or service provider. But there are also a lot of cases where the user is involved in the security process, especially when he or she is intentionally attacked. Socially engineered phishing attacks are such a security issue where users are directly attacked to reveal private data and credentials to an unauthorized attacker. These types of attacks are the main focus of the research presented within my thesis. I look at how these attacks can be counteracted by detecting them in the first place, but also by mediating these detection results to the user. In prior research and development these two areas have most often been regarded separately, and new security measures were developed without taking the final step of interacting with the user into account. This interaction mainly means presenting the detection results and receiving final decisions from the user. As an overarching goal within this thesis I look at these two aspects united, stating the overall protection as the sum of detection and "user intervention". Within nine different research projects about phishing protection, this thesis gives answers to ten different research questions in the areas of creating new phishing detectors (phishing detection) and providing usable user feedback for such systems (user intervention). The ten research questions cover five different topics in both areas, ranging from the definition of the respective topic over ways to measure and enhance the areas to, finally, reasoning about what makes sense. The research questions have been chosen to cover the range of both areas and the interplay between them. They are mostly answered by developing and evaluating different prototypes built within the projects that cover a range of human-centered detection properties and by evaluating how well these are suited for phishing detection. I also take a look at different possibilities for user intervention (e.g., what should a warning look like? Should it be blocking or non-blocking, or perhaps something else entirely?). As a major contribution I finally present a model that combines phishing detection and user intervention and propose development and evaluation recommendations for similar systems. The research results show that a security detector whose results are relevant for end users can only be successful if the final user feedback has already been taken into account during the development process.
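As a toy illustration of pairing detection with user intervention, the following sketch checks a URL against a few simple warning signs and then mediates the result to the user. The features, threshold and wording are illustrative assumptions and are not the detectors or warnings developed in the thesis; a real system would combine such signals with a trained classifier and carefully designed warning dialogs.

```python
# Minimal sketch: a feature-based phishing check (detection) plus a simple
# message to the user (intervention). Features and threshold are illustrative.
from urllib.parse import urlparse

SUSPICIOUS_WORDS = ("login", "verify", "update", "secure", "account")

def phishing_score(url: str) -> int:
    """Count simple warning signs; a higher score means 'more suspicious'."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    score = 0
    score += host.count(".") > 3                            # many nested subdomains
    score += any(c.isdigit() for c in host.split(".")[0])   # digits in the host name
    score += "@" in url                                     # user-info trick
    score += parsed.scheme != "https"                       # no TLS
    score += any(w in url.lower() for w in SUSPICIOUS_WORDS)
    score += len(url) > 75                                  # unusually long URL
    return score

def warn_user(url: str, threshold: int = 3) -> None:
    """The 'user intervention' half: mediate the detection result to the user."""
    score = phishing_score(url)
    if score >= threshold:
        print(f"WARNING: {url} looks like phishing (score {score}).")
    else:
        print(f"{url} shows no strong phishing signs (score {score}).")

if __name__ == "__main__":
    warn_user("http://paypal.login.verify-account.example123.com/@signin")
    warn_user("https://www.example.com/")
```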
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Cloud computing is a paradigm for using IT services with characteristics such as flexible and scalable service usage, on-demand availability, and pay-as-you-go billing. The respective services are called cloud services, and their nature usually motivates a differentiation into three layers: Infrastructure as a Service (IaaS) for cloud services offering the functionality of hardware resources in a virtualised way, Platform as a Service (PaaS) for services acting as execution platforms, and Software as a Service (SaaS) representing applications provided in a cloud computing way. Any of these services is offered with the illusion of unlimited scalability. The infinity gained by this illusion implies the need for some kind of regulation mechanism to manage supply and demand. Today's static pricing mechanisms are limited in their capabilities to adapt to dynamic characteristics of cloud environments such as changing workloads. The solution is a dynamic pricing approach comparable to today's exchanges. This requires comparability of cloud services and standardised access in order to avoid vendor lock-in. To achieve comparability, a classification for cloud services is introduced, where classes of cloud services representing tradable goods are expressed by the minimum requirements for a certain class. The main result of this work is a framework for exchange-based trading of cloud computing commodities, which is composed of four core components derived from existing exchange market places. An exchange component takes care of accepting orders from buyers and sellers and determines the price for the goods. A clearing component is responsible for the financial closing of a trade. The settlement component takes care of the delivery of the cloud service. A rating component monitors cloud services and logs service level agreement breaches to calculate provider ratings, especially for reliability, which is an important factor in cloud computing. The framework establishes a new basis for using cloud services and more advanced business models. Additionally, an overview of selected economic aspects is provided, including ideas for derivative financial instruments like futures, options, insurances, and more complex ones. A first version of the framework is currently being implemented and in use at Deutsche Börse Cloud Exchange AG.
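The sketch below mirrors the four core components named in the abstract (exchange, clearing, settlement, rating) as minimal classes. The interfaces, the mid-price matching rule and the reliability formula are illustrative assumptions, not the framework implemented at Deutsche Börse Cloud Exchange AG.

```python
# Conceptual sketch of the four core components of an exchange for cloud
# computing commodities. All interfaces and formulas are illustrative.
from dataclasses import dataclass

@dataclass
class Order:
    side: str           # "buy" or "sell"
    service_class: str  # tradable good, e.g. "IaaS-small"
    price: float
    party: str

class Exchange:
    """Accepts orders and determines a price when buy and sell orders cross."""
    def __init__(self):
        self.books = {}
    def submit(self, order: Order):
        book = self.books.setdefault(order.service_class, {"buy": [], "sell": []})
        book[order.side].append(order)
        buys = sorted(book["buy"], key=lambda o: -o.price)
        sells = sorted(book["sell"], key=lambda o: o.price)
        if buys and sells and buys[0].price >= sells[0].price:
            buy, sell = buys[0], sells[0]
            book["buy"].remove(buy)
            book["sell"].remove(sell)
            return buy, sell, (buy.price + sell.price) / 2   # matched trade
        return None

class Clearing:
    """Financial closing of a trade."""
    def settle_payment(self, buyer, seller, price):
        print(f"{buyer.party} pays {price:.3f} to {seller.party}")

class Settlement:
    """Delivery of the traded cloud service itself."""
    def deliver(self, seller, buyer, service_class):
        print(f"{seller.party} provisions {service_class} for {buyer.party}")

class Rating:
    """Monitors SLA breaches and derives a reliability rating per provider."""
    def __init__(self):
        self.breaches = {}
    def log_breach(self, provider):
        self.breaches[provider] = self.breaches.get(provider, 0) + 1
    def reliability(self, provider):
        return 1.0 / (1 + self.breaches.get(provider, 0))

if __name__ == "__main__":
    exchange, clearing, settlement, rating = Exchange(), Clearing(), Settlement(), Rating()
    exchange.submit(Order("sell", "IaaS-small", 0.09, "ProviderA"))
    match = exchange.submit(Order("buy", "IaaS-small", 0.10, "CustomerB"))
    if match:
        buy, sell, price = match
        clearing.settle_payment(buy, sell, price)
        settlement.deliver(sell, buy, buy.service_class)
    rating.log_breach("ProviderA")
    print("reliability of ProviderA:", rating.reliability("ProviderA"))
```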
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Crowdsourcing denotes the transfer of work commonly carried out by single humans to a large group of people. Nowadays, crowdsourcing is employed for many purposes, like people contributing their knowledge to Wikipedia, researchers predicting diseases from data on Twitter, or players solving protein folding problems in games. Still, there are areas for which the application of crowdsourcing has not yet been investigated thoroughly. This thesis examines crowdsourcing for two such areas: for empirical research in human-oriented sciences, focusing on linguistic field research, and for e-learning. Human-oriented sciences, like linguistics, sociology, or art history, depend on empirical research. For example, in traditional linguistic field research researchers ask questions and fill in forms. Such methods are time-consuming, costly, and not free of biases. This thesis proposes the application of crowdsourcing techniques to overcome these disadvantages and to make empirical research more efficient. To this end, the concept of a generic market for trading symbolic goods and speculating on their characteristics in a playful manner, called Agora, is introduced. Agora aims to be an "operating system" for social media applications gathering data. Furthermore, the Web-based crowdsourcing platform metropolitalia has been established for hosting two social media applications based upon Agora: Mercato Linguistico and Poker Parole. These applications have been conceived as part of this thesis for gathering complementary data and meta-data on Italian language varieties. Mercato Linguistico incites players to express their own knowledge or beliefs, while Poker Parole incites players to make conjectures about the contributions of others. Thereby the primary meta-data collected with Mercato Linguistico are enriched with secondary, reflexive meta-data from Poker Parole, which are needed for studies on the perception of languages. An evaluation of the data gathered on metropolitalia exhibits the viability of the market-based approach of Agora and highlights its strengths. E-learning is concerned with the use of digital technology for learning, nowadays especially via the Internet. This thesis investigates how e-learning applications can support students with association-based learning and lecturers with teaching. For that purpose, a game-like e-learning tool named Termina is proposed in this thesis. From the data collected with Termina, association maps are constructed. An association map is a simplified version of a concept map, in which concepts are represented as rectangles and relationships between concepts as links. They constitute an abstract overview of a topic. Students profit from the availability of the association maps, learn from other participating students, and can track their own learning progress. Lecturers gain insights into the knowledge and into potential misunderstandings of their students. An evaluation of Termina and the collected data along a university course exhibits Termina's usefulness for both students and lecturers.
The main contributions of this thesis are (1) a literature review of collective intelligence, crowdsourcing, and related fields, (2) a model of a generic market for gathering data for empirical research efficiently, (3) two applications based on this model and the results of an evaluation of the data gathered with them, (4) the game-like e-learning tool Termina together with insights from its evaluation, and (5) a generic software architecture for all aforementioned applications.
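To make the association-map idea tangible, here is a minimal sketch that builds such a map from terms students associate with a stimulus concept: concepts become nodes, co-occurring terms become weighted links. The input format and the co-occurrence weighting are illustrative assumptions, not Termina's actual data model.

```python
# Minimal sketch of building an association map (concepts as nodes,
# weighted links between co-occurring concepts) from student contributions.
from collections import Counter
from itertools import combinations

# hypothetical student contributions: stimulus -> list of associated terms
contributions = [
    ("database", ["SQL", "table", "index"]),
    ("database", ["SQL", "transaction"]),
    ("database", ["index", "B-tree"]),
]

def build_association_map(contributions):
    nodes = Counter()   # how often each concept was named
    links = Counter()   # how often two concepts appeared together
    for stimulus, terms in contributions:
        concepts = [stimulus] + terms
        nodes.update(concepts)
        for a, b in combinations(sorted(set(concepts)), 2):
            links[(a, b)] += 1
    return nodes, links

if __name__ == "__main__":
    nodes, links = build_association_map(contributions)
    print("concepts:", dict(nodes))
    for (a, b), weight in links.most_common(5):
        print(f"{a} -- {b} (weight {weight})")   # strongest relationships first
```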
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Cloud computing offers users the possibility to use virtualized applications, platforms, and even hardware as services. The providing operator can elastically adapt the required service capacity in the cloud to the current demand. However, the ability to add arbitrary amounts of resources brings with it considerable ecological and economic responsibility. In addition, systems are usually over-provisioned at high cost in order to avoid rejecting requests during load peaks. To date, no provider offers a fully automated solution for optimal resource management. To guarantee the scalability of a cloud-based system, only rule-based mechanisms exist: the user has to configure the logic manually in order to define the required number of machines and their location. Many questions concerning resource management and the resulting costs therefore remain unsolved for cloud users. In this work, a protocol is developed that defines an improved order of request processing and determines the required amount of resources. The resource allocation is based, on the one hand, on demand reservation by the resources; on the other hand, the usage behavior can be influenced by the service. The simulation results show strongly reduced response times and rejection rates. In cloud environments, however, the efficient processing of requests is often impaired by dependencies between the machines operating the services. The protocol is therefore extended with a self-calibration mechanism and a resource control procedure. By mapping the dependencies onto a graph, the overall system becomes scalable without having to monitor the utilization of every single resource. From human users, however, one cannot demand an advance reservation of the time of their service usage, as is easily possible for machines. For this case, the present work therefore enables the extrapolation of user data from records of social networks. Without burdening the users, the resource demand at a specific location is identified. For such a system adaptation, the algorithm designed in this work performs a location-specific prediction of the required number of resources. This information serves as input for the developed protocol and thus provides well-proportioned provisioning. The costs arising from scaling are usually hard to estimate. For this reason, cost functions for the user and the provider are derived in the course of this work. They make it possible to compute the optimal trade-off between lower costs with a smaller amount of resources and higher user satisfaction with generous capacity coverage. A prototype implementation together with various simulations shows that the developed approaches realize significantly improved automated resource management.
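The following is a minimal sketch of the trade-off described in the last part of the abstract: a cost function that weighs the expense of running machines against dissatisfaction from rejected requests, minimized over the number of machines. The linear rejection model and all prices are illustrative assumptions, not the cost functions derived in the thesis.

```python
# Minimal sketch of a provisioning cost function: machine cost vs. a penalty
# for rejected requests, minimized over the number of machines.

def total_cost(machines, demand, capacity_per_machine=100,
               price_per_machine=2.0, penalty_per_rejection=0.05):
    """Provider cost plus a penalty for requests that cannot be served."""
    served = min(demand, machines * capacity_per_machine)
    rejected = demand - served
    return machines * price_per_machine + rejected * penalty_per_rejection

def best_provisioning(demand, max_machines=50):
    """Pick the number of machines with minimal total cost for a given demand."""
    return min(range(max_machines + 1), key=lambda n: total_cost(n, demand))

if __name__ == "__main__":
    for demand in (150, 480, 1200):      # e.g. demand predicted from social-network data
        n = best_provisioning(demand)
        print(f"demand {demand:4d}: run {n:2d} machines, "
              f"cost {total_cost(n, demand):.2f}")
```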
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
In our everyday life we carry out a multitude of activities in parallel without focusing our attention explicitly on them. We drink a cup of tea while reading a book, we signal to a colleague passing by with a hand gesture that we are concentrating right now and that they should wait a moment, or we walk a few steps backwards while taking photos. Many of these interactions, like drinking, sending signals via gestures or walking, are rather complex by themselves. By means of learning and training, however, these interactions become part of our routines and habits and therefore consume only little or no attentional resources. In contrast, when interacting with digital devices, we are often asked for our full attention. To carry out even small and marginal tasks, we are regularly forced to switch windows and perform precise interactions (e.g., pointing with the mouse); these systems thereby trigger context and focus switches, disrupting us in our main focus and task. Peripheral interaction aims at making use of human capabilities and senses like divided attention, spatial memory and proprioception to support interaction with digital devices in the periphery of attention, consequently quasi-parallel to another primary task. In this thesis we investigate peripheral interaction in the context of a standard desktop computer environment. We explore three interaction styles for peripheral interaction: graspable interaction, touch input and freehand gestures. StaTube investigates graspable interaction in the domain of instant messaging, while the Appointment Projection uses simple wiping gestures to access information about upcoming appointments. These two explorations focus on one interaction style each and offer first insights into the general benefits of peripheral interaction. In the following we carried out two studies comparing all three interaction styles (graspable, touch, freehand) for audio player control and for dealing with notifications. We found that all three interaction styles are generally suitable for peripheral interaction but come with different advantages and disadvantages. The last set of explorative studies deals with the ability to recall spatial locations in 2D as well as 3D. The Unadorned Desk makes use of the physical space around the desktop computer and thereby offers an extended interaction space to store and retrieve virtual items such as commands, applications or tools. Finally, the evaluation of peripheral interaction is not straightforward, as the systems are designed to blend into the environment and not to draw attention to themselves. We propose an additional evaluation method for the lab to complement the current evaluation practice in the field. The main contributions of this thesis are (1) an exhaustive classification and a more detailed look at manual peripheral interaction for tangible, touch and freehand interaction. Based on these explorations with all three interaction styles, we offer (2) implications in terms of the overall benefits of peripheral interaction, learnability and habituation, visual and mental attention, feedback and handedness for future peripheral interaction design. Finally, derived from a diverse set of user studies, we assess (3) evaluation strategies enriching the design process for peripheral interaction.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
One of the most frequent ways in which we interact with the surrounding environment is visual. Hence, imaging is a very common means of gaining information about and learning from the environment. Particularly in the field of cellular biology, imaging is applied in order to get an insight into the minute world of cellular complexes. As a result, in recent years many researchers have focused on developing suitable new image processing approaches which facilitate the extraction of meaningful quantitative information from image data sets. In spite of recent progress, due to the huge sets of acquired images and the demand for increasing precision, digital image processing and statistical analysis are gaining more and more importance in this field. There are still limitations in bioimaging techniques that prevent sophisticated optical methods from reaching their full potential. For instance, in the 3D Electron Microscopy (3DEM) process nearly all acquired images require manual post-processing to enhance the performance, which should be substituted by an automatic and reliable approach (dealt with in Part I). Furthermore, the algorithms to localize individual fluorophores in 3D super-resolution microscopy data are still in their initial phase (discussed in Part II). In general, biologists currently lack automated and high-throughput methods for the quantitative global analysis of 3D gene structures. This thesis focuses mainly on microscopy imaging approaches based on Machine Learning, statistical analysis and image processing in order to cope with and improve the task of quantitative analysis of huge image data. The main task consists of building a novel paradigm for microscopy imaging processes which is able to work in an automatic, accurate and reliable way. The specific contributions of this thesis can be summarized as follows: • Substitution of the time-consuming, subjective and laborious task of manual post-picking in the Cryo-EM process by a fully automatic particle post-picking routine based on Machine Learning methods (Part I). • Quality enhancement of the 3D reconstruction image due to the high performance of the automatic post-picking step (Part I). • Development of a fully automatic tool for detecting subcellular objects in multichannel 3D fluorescence images (Part II). • Extension of known colocalization analysis by using spatial statistics in order to investigate the surrounding point distribution, enabling the analysis of colocalization in combination with statistical significance (Part II). All introduced approaches are implemented and provided as toolboxes which are freely available for research purposes.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
A typical computer workplace today consists of a horizontal work surface and at least one vertical screen. Both orientations of the work areas have advantages for particular tasks: reading long texts, for example, is more ergonomic on vertical surfaces, while directly editing texts is less strenuous on horizontal surfaces. Switching between the two work areas is cumbersome, however, since the horizontal work surface is usually not digital. Even the growing prevalence of touch-sensitive screens in the horizontal work area (e.g., tablets) does not solve this problem. While these devices bring the advantages of direct interaction, they also raise the question of how digital content is exchanged between the differently oriented digital areas. One solution is the combination of differently oriented displays. There are several approaches to combining them, but the displays are usually physically separated from each other. As a result, users tend to perceive the displays as separate units, and no simple transition between the displays is possible. One type of connection that has so far remained largely unexplored is the combination of both display areas through a curved connection. The curve creates a seamless connection and an uninterrupted transition between the display areas. The effect of such a transition on user interaction is, however, unknown. The curvature of the screen also opens up possibilities for novel visualizations that benefit from the seamless combination of different display orientations. In addition, conventional graphical user interfaces can be optimized with regard to the display shape. In this thesis, such a display is presented and its effects on user interaction and its potential for graphical user interfaces are investigated. The Curve is an interactive display that combines a horizontal and a vertical area through a seamless, curved connection. The first part of the thesis describes the development of the display shape and the technical realization of the prototype. The second part evaluates the influence of the display shape on both direct and indirect interaction styles. Furthermore, the Curve is extended with a tangible user interface, and the effect of the display shape on the usability of this interface is examined. The third part presents two visualizations and an existing graphical user interface, each adapted to the curved display shape. The practical experience gained from the development processes is then summarized in the form of recommendations for comparable display projects. The thesis concludes with starting points for further technical development as well as additional exemplary application scenarios that can benefit from the curved display shape of the Curve.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Knowledge Discovery in Databases (KDD) is the process of extracting non-trivial patterns from large databases, with the goal that these patterns are previously unknown, potentially useful, statistically sound, and understandable. The process comprises several steps such as selection, preprocessing, evaluation, and the analysis step known as data mining. One of the central tasks in data mining is outlier detection: identifying observations that are unusual and appear inconsistent with the majority of the data. Such rare observations can have various causes: measurement errors, unusually strong (but nevertheless genuine) deviations, or corrupted or even manipulated data. In recent years, numerous outlier detection methods have been proposed that often appear to differ only slightly, yet are presented experimentally in the publications as "clearly superior". One focus of this work is to bring these different methods together and to modularize them within a common formalism. On the one hand, this simplifies the analysis of their differences; on the other hand, it increases the flexibility of the methods, since modules can be added or replaced to adapt a method to changed requirements and data types. To demonstrate the advantages of the modularized structure, (i) numerous existing algorithms are formalized within the scheme, (ii) new modules are added that improve the robustness, efficiency, statistical significance, and usability of the scoring functions and that can be combined with the existing methods, (iii) modules are modified to apply existing and new algorithms to other, often more complex, data types such as geographically annotated data, time series, and high-dimensional spaces, (iv) several methods are combined into a single procedure to obtain better results, and (v) scalability to large data volumes is improved by approximate or exact indexing. The starting point of this work is the Local Outlier Factor (LOF) algorithm. It is first modified with small extensions to improve the robustness and usability of the scoring. These methods are then formalized in a common framework for local outlier detection, so that the corresponding advantages can also be exploited in other algorithms. By abstracting from a single vector space to general data types, spatial and temporal relationships can also be analyzed. The use of subspace- and correlation-based neighborhoods then makes it possible to detect new types of outliers in arbitrarily oriented projections. Improvements to the scoring functions allow the scores to be interpreted with the statistical intuition of a probability, instead of merely producing an outlier ranking as before. Improved models also generate explanations of why an object was scored as an outlier. Subsequently, improvements to various modules are introduced that, among other things, make it possible to apply the algorithms to much larger datasets, in approximately linear instead of quadratic time, by allowing approximate neighborhoods at a small loss of precision and effectiveness.
Furthermore, it is shown how several such algorithms with different intuitions can be used simultaneously and how their results can be combined into a single method that can thereby detect different types of outliers. Finally, for real-world datasets, new outlier detection algorithms are constructed that are adapted to the specific problem. These new methods allow insightful results to be obtained that could not be achieved with the existing methods. Since they were developed from the building blocks of the modular structure, a direct connection to the earlier approaches is given. By using the index structures, the algorithms can be executed efficiently even on large datasets.
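For orientation, here is a minimal sketch of the baseline Local Outlier Factor (LOF) algorithm that the thesis takes as its starting point, using the scikit-learn implementation; the modular framework, subspace neighborhoods and probabilistic scores described above go well beyond this plain usage, and the toy data below is purely illustrative.

```python
# Minimal sketch: plain LOF on toy 2D data via scikit-learn.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),      # dense inliers
               rng.uniform(-8, 8, size=(10, 2))])    # a few scattered outliers

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)                # -1 marks detected outliers
scores = -lof.negative_outlier_factor_     # larger score = more outlying

print("detected outliers:", int((labels == -1).sum()))
print("top-3 outlier scores:", np.round(np.sort(scores)[-3:], 2))
```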
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Statistical methods usually require that the analyzed data are correct and precise observations of the variables of interest. In practice, however, often only incomplete or uncertain information about the quantities of interest is available. The question studied in the present thesis is how a regression analysis can reasonably be performed when the variables are only imprecisely observed. First, different approaches to analyzing imprecisely observed variables that have been proposed in the statistics literature are discussed. Then, a new likelihood-based methodology for regression analysis with imprecise data, called Likelihood-based Imprecise Regression, is introduced. The corresponding methodological framework is very broad and permits accounting for coarsening errors, in contrast to most alternative approaches to analyzing imprecise data. The methodology suggests considering as the result of a regression analysis the entire set of all regression functions that cannot be excluded in the light of the data, which can be interpreted as a confidence set. In the subsequent chapter, a very general regression method is derived from the likelihood-based methodology. This regression method does not impose restrictive assumptions about the form of the imprecise observations, about the underlying probability distribution, or about the shape of the relationship between the variables. Moreover, an exact algorithm is developed for the special case of simple linear regression with interval data, and selected statistical properties of this regression method are studied. The proposed regression method turns out to be robust in terms of a high breakdown point and to provide very reliable insights in the sense of a set-valued result with a high coverage probability. In addition, an alternative approach proposed in the literature based on Support Vector Regression is studied in detail and generalized by embedding it into the framework of the formerly introduced likelihood-based methodology. In the end, the discussed regression methods are applied to two practical questions.
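The set-valued flavor of such an analysis can be illustrated with a deliberately simplified sketch: instead of fitting one line, collect all candidate lines that are compatible with every interval-valued observation. The grid search and the "compatible with all intervals" criterion are illustrative simplifications, not the exact algorithm or the likelihood-based criterion developed in the thesis.

```python
# Minimal sketch of a set-valued result for simple linear regression with
# interval-valued responses: keep every candidate line that is not excluded.
import numpy as np

# observations: precise x, imprecise y given as intervals [y_lo, y_hi]
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_lo = np.array([1.8, 3.5, 5.9, 7.6, 9.4])
y_hi = np.array([2.6, 4.7, 6.8, 8.9, 10.9])

def compatible(slope, intercept):
    """A line is not excluded if its prediction lies inside every interval."""
    pred = slope * x + intercept
    return np.all((pred >= y_lo) & (pred <= y_hi))

candidates = [(a, b)
              for a in np.linspace(0.0, 4.0, 161)
              for b in np.linspace(-2.0, 2.0, 161)
              if compatible(a, b)]

slopes = [a for a, _ in candidates]
print("number of non-excluded lines:", len(candidates))
print("slope range: [%.2f, %.2f]" % (min(slopes), max(slopes)))
```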
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Online Social Networks (OSNs) represent the predominant medium for computer-mediated communication and for the dissemination of personal, business, and scientific content. A number of incidents in the recent past have shown that publishing private information in OSNs entails considerable risks for the security and privacy of their users. The same holds for the domains of business and science. The root cause is the centralized administration of users and their published content under a single administrative domain. With Vegas, the first part of this thesis presents a decentralized OSN that counters this problem with a restrictive security architecture. Its primary goal is the technical realization of the right to informational self-determination. To this end, Vegas restricts access to the social graph and any exchange of information to the users of one's own ego network. Besides enabling communication and the publication of personal information, some OSNs also allow browsing the social graph and searching for content of other users. In order to benefit from the accumulated knowledge of the social graph even in secure and privacy-preserving OSNs such as Vegas, the second part of this thesis deals with the development and analysis of intelligent prioritization strategies for forwarding search queries within decentralized OSNs. In the context of OSNs, new algorithms and protocols are typically first evaluated by simulation, usually based on a crawled dataset of an OSN. Obviously, crawling is not possible in secure and privacy-preserving decentralized OSNs such as Vegas. To address this problem, the third part of this thesis deals with the development of a generic model for the artificial generation of social interaction graphs. In addition to the structural characteristics of centralized and decentralized systems, the interaction behavior of the users of an OSN is modeled for the first time. The suitability of the model is evaluated on the basis of crawled social graphs.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 02/02
Various experiences in modern life are in one form or another connected to car rides. However, the automotive industry has so far regarded driving as the only relevant experience, a perspective which consequently dominates the field of interaction design for vehicles. Yet the car is an exceptionally promising space for experiences that go beyond the actual driving task, such as intensive conversation or exploration. So how is it possible to design for special situations like those, and thereby create positive emotions for drivers and passengers alike? To meet this objective it is necessary to use the human, instead of the function, as the starting point. The design approach of "Experience Design" defined by Marc Hassenzahl provides exactly this focus on the human and concentrates first on their experience. Here, positive emotions are specifically created through the fulfillment of psychological needs. Experience Design enables the precise analysis of experiences in advance of the design by clarifying, with the help of psychological needs, why a considered experience is viewed as positive. Furthermore, Experience Design supports the composition of an Experience Story, which is attuned to the desired psychological needs and which defines the experience to be designed. This experience can then gradually be translated into an interaction design. Finally, with the help of technology, the created experience can be lived through in situ by participants and later analysed. Based upon this design approach and by means of methods drawn from the fields of human-machine interaction as well as psychology, four studies on the design of experiences through interaction products in the automotive domain are presented. The created experiences are divided into "Experiences at Group Drives in the Car for Pleasure" and "Experiences While Commuting Alone". These experiences take place in different scenarios, namely: in a motorcade, on an exploratory cruise, on a commuting ride, and while driving considerately. Out of the practical application of Experience Design in these studies, a best practice for the use of the employed methods is developed. Thereby, this work brings to light the possibilities for using technology to design experiences that go beyond the mere act of driving. Furthermore, the challenges of designing experiences in usability-focused environments are shown. Thus, this work aims to offer inspiration to designers and researchers, particularly in the automotive domain, for designing experiences and thereby furthering innovation.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Service providers in the field of information technology (IT) face the major challenge of offering ever more complex IT services cost-effectively and operating them efficiently. To achieve this, the discipline of IT Service Management (ITSM) introduces structured management processes. Tools support these processes and represent an important interface between people, processes, and technology. With these tools, processes can be coordinated, the technology can be managed efficiently, and important operational information can be consolidated. The appropriate use of tools is an essential prerequisite for carrying out complex tasks with as little effort as possible. Efficient ITSM therefore always pursues the goal of using tools optimally and of supporting the ITSM processes in a meaningful way. This thesis presents an approach for optimizing the use of tools accordingly. The core of the approach is the definition of a maturity model for tool landscapes. With this model, tool landscapes can be assessed and their support of the ITSM processes can be evaluated systematically. The result is a weighted list of requirements for the tool landscape in order to achieve the best possible process support. Due to the prioritization of the requirements, an IT service provider is not forced to adapt the tool landscape completely in one large step; instead, the improvements can be made incrementally. The maturity model systematically supports implementing the most important requirements first, so that the ITSM processes can work effectively. Efficiency is then increased in further steps by implementing additional requirements. The creation of such a maturity model is described in the following. First, requirements for a suitable approach were analyzed and a concept for a maturity model was developed. Building on this, the concept was applied exemplarily to develop a maturity model for tool landscapes supporting processes according to ISO/IEC 20000. The thesis concludes with an evaluation of the approach, in which the developed maturity model was applied empirically in the scenario of an IT service provider. This work lays the foundation for a holistic and integrated management of the tool landscape of IT service providers. Future work can adopt this methodology for specific application scenarios. In the long term, this work is intended to serve as a basis for establishing a standardized maturity model for tool landscapes in the context of ITSM.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Both the current trends in technology, such as smart phones, general mobile devices, stationary sensors and satellites, as well as a new user mentality of utilizing this technology to voluntarily share information produce a huge flood of geo-spatial and geo-spatio-temporal data. This data flood provides a tremendous potential for discovering new and possibly useful knowledge. In addition to the fact that measurements are imprecise due to the physical limitations of the devices, some form of interpolation is needed in-between discrete time instances. From a complementary perspective, to reduce the communication and bandwidth utilization along with the storage requirements, the data is often subjected to a reduction, thereby eliminating some of the known/recorded values. These issues introduce the notion of uncertainty in the context of spatio-temporal data management, an aspect raising an imminent need for scalable and flexible data management. The main scope of this thesis is to develop effective and efficient techniques for similarity search and data mining in uncertain spatial and spatio-temporal data. In a plethora of research fields and industrial applications, these techniques can substantially improve decision making, minimize risk and unearth valuable insights that would otherwise remain hidden. The challenge of effectiveness in uncertain data is to correctly determine the set of possible results, each associated with the correct probability of being a result, in order to give a user confidence about the returned results. The complementary challenge of efficiency is to compute these results and the corresponding probabilities in an efficient manner, allowing for reasonable querying and mining times, even for large uncertain databases. The paradigm used to master both challenges is to identify a small set of equivalence classes of possible worlds, such that members of the same class can be treated as equivalent in the context of a given query predicate or data mining task. In the scope of this work, this paradigm is formally defined and applied to the most prominent classes of spatial queries on uncertain data, including range queries, k-nearest neighbor queries, ranking queries and reverse k-nearest neighbor queries. For this purpose, new spatial and probabilistic pruning approaches are developed to further speed up query processing. Furthermore, the proposed paradigm allows us to develop the first efficient solution for the problem of frequent co-location mining on uncertain data. Special emphasis is placed on the temporal aspect of applications using modern data collection technologies. While the aforementioned techniques work well for single points in time, the prediction of query results over time remains a challenge. This thesis fills this gap by modeling an uncertain spatio-temporal object as a stochastic process, and by applying the above paradigm to efficiently query, index and mine historical spatio-temporal data.
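To illustrate what a probabilistic query over uncertain spatial data computes, here is a minimal sketch of a probabilistic range query, where each object is given by a small discrete set of possible positions with probabilities, a simplified stand-in for the possible-worlds semantics discussed above. The pruning techniques and the equivalence-class paradigm of the thesis are not reproduced here.

```python
# Minimal sketch of a probabilistic range query over uncertain spatial objects.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class UncertainObject:
    name: str
    alternatives: List[Tuple[float, float, float]]  # (x, y, probability)

def inside(x, y, rect):
    (x1, y1), (x2, y2) = rect
    return x1 <= x <= x2 and y1 <= y <= y2

def prob_range_query(objects, rect, threshold=0.5):
    """Return each object whose probability of lying in rect meets the threshold."""
    results = []
    for obj in objects:
        p = sum(p for x, y, p in obj.alternatives if inside(x, y, rect))
        if p >= threshold:
            results.append((obj.name, p))
    return results

if __name__ == "__main__":
    objects = [
        UncertainObject("taxi_1", [(1.0, 1.0, 0.6), (4.0, 4.0, 0.4)]),
        UncertainObject("taxi_2", [(5.0, 5.0, 0.9), (0.5, 0.5, 0.1)]),
    ]
    query_rect = ((0.0, 0.0), (2.0, 2.0))
    print(prob_range_query(objects, query_rect, threshold=0.5))
```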
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Relational learning is concerned with learning from data where information is primarily represented in form of relations between entities. In recent years, this branch of machine learning has become increasingly important, as relational data is generated in an unprecedented amount and has become ubiquitous in many fields of application such as bioinformatics, artificial intelligence and social network analysis. However, relational learning is a very challenging task, due to the network structure and the high dimensionality of relational data. In this thesis we propose that tensor factorization can be the basis for scalable solutions for learning from relational data and present novel tensor factorization algorithms that are particularly suited for this task. In the first part of the thesis, we present the RESCAL model -- a novel tensor factorization for relational learning -- and discuss its capabilities for exploiting the idiosyncratic properties of relational data. In particular, we show that, unlike existing tensor factorizations, our proposed method is capable of exploiting contextual information that is more distant in the relational graph. Furthermore, we present an efficient algorithm for computing the factorization. We show that our method achieves better or on-par results on common benchmark data sets, when compared to current state-of-the-art relational learning methods, while being significantly faster to compute. In the second part of the thesis, we focus on large-scale relational learning and its applications to Linked Data. By exploiting the inherent sparsity of relational data, an efficient computation of RESCAL can scale up to the size of large knowledge bases, consisting of millions of entities, hundreds of relations and billions of known facts. We show this analytically via a thorough analysis of the runtime and memory complexity of the algorithm as well as experimentally via the factorization of the YAGO2 core ontology and the prediction of relationships in this large knowledge base on a single desktop computer. Furthermore, we derive a new procedure to reduce the runtime complexity for regularized factorizations from O(r^5) to O(r^3) -- where r denotes the number of latent components of the factorization -- by exploiting special properties of the factorization. We also present an efficient method for including attributes of entities in the factorization through a novel coupled tensor-matrix factorization. Experimentally, we show that RESCAL allows us to approach several relational learning tasks that are important to Linked Data. In the third part of this thesis, we focus on the theoretical analysis of learning with tensor factorizations. Although tensor factorizations have become increasingly popular for solving machine learning tasks on various forms of structured data, there exist only very few theoretical results on the generalization abilities of these methods. Here, we present the first known generalization error bounds for tensor factorizations. To derive these bounds, we extend known bounds for matrix factorizations to the tensor case. Furthermore, we analyze how these bounds behave for learning on over- and understructured representations, for instance, when matrix factorizations are applied to tensor data. In the course of deriving generalization bounds, we also discuss the tensor product as a principled way to represent structured data in vector spaces for machine learning tasks. 
In addition, we evaluate our theoretical discussion with experiments on synthetic data, which support our analysis.
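For illustration, the following minimal Python sketch fits the RESCAL model, in which every relation slice X_k is approximated by A R_k A^T, to a tiny dense tensor using plain gradient descent. It is only a toy illustration under that modelling assumption; the thesis's efficient ALS algorithm, its sparse computation and the coupled tensor-matrix factorization are not reproduced here.

```python
# Illustrative sketch (not the thesis's ALS algorithm): fit the RESCAL model
# X_k ~ A R_k A^T for a small dense relational tensor by gradient descent.
import numpy as np

def rescal_gd(X, rank, lam=0.01, lr=0.02, iters=5000, seed=0):
    """X: array of shape (n_relations, n_entities, n_entities) holding 0/1 facts."""
    rng = np.random.default_rng(seed)
    m, n, _ = X.shape
    A = rng.normal(scale=0.1, size=(n, rank))        # latent entity factors
    R = rng.normal(scale=0.1, size=(m, rank, rank))  # one core matrix per relation
    for _ in range(iters):
        grad_A = lam * A
        for k in range(m):
            E = X[k] - A @ R[k] @ A.T                # residual of slice k
            grad_A += -(E @ A @ R[k].T + E.T @ A @ R[k])
            grad_Rk = -(A.T @ E @ A) + lam * R[k]
            R[k] -= lr * grad_Rk
        A -= lr * grad_A
    return A, R

# Toy example: 3 entities, 1 relation; reconstructed scores can be read as link predictions.
X = np.zeros((1, 3, 3))
X[0, 0, 1] = X[0, 1, 2] = X[0, 0, 2] = 1.0
A, R = rescal_gd(X, rank=2)
print(np.round(A @ R[0] @ A.T, 2))
```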
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Thu, 25 Jul 2013 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/16028/ https://edoc.ub.uni-muenchen.de/16028/13/Emrich_Tobias.pdf Emrich, Tobias ddc:004, ddc:000, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
With the advent and growing adoption of smartphones, location-based services have become a fixed part of many users' daily lives. Based on the user's current location, information is filtered in a targeted way, information about the surroundings is made available, or search results are ranked by locality. In addition, certain services, such as mobile route finding and navigation, become possible in the first place. Many services not only take the user's own position into account but also allow the positions of friends to be displayed or automatic notifications to be triggered when certain regions are entered. If a location-based service requires high positioning accuracy, the position is determined via global satellite navigation systems. There is also a need for location-based information inside large, complex buildings such as museums, airports or hospitals. Examples include searching for a particular exhibit in a museum, navigating to the correct gate at an airport, or meeting a friend in the same building. Such location-based services inside buildings are referred to in the following as indoor location-based services (I-LBS). They simplify our lives in many situations and will in the future reach a level of adoption similar to that of conventional location-based services. At present, however, no solution exists that enables I-LBS on a large scale. There are two main reasons for this: First, in contrast to outdoor areas, there is no generally available map base. Building plans are often kept under lock and key and are better suited to planning and supervising construction work than to extracting semantic information. Second, the reception of satellite signals inside buildings is so poor that, in general, no sufficiently accurate position can be determined from them, and no alternative low-cost, universally available positioning method of sufficiently high accuracy currently exists. This thesis presents and evaluates solutions to both problems that are intended to allow users a quality of service comparable to what they are already used to outdoors. Based on the requirements of I-LBS and positioning systems, two different environment models are developed. One is based on the Geography Markup Language (GML) and offers a flexible vector-based representation of a building with hierarchical and graph-based elements. In addition, the fully automatic generation of such a model from building plans is presented, which constitutes a further step towards the large-scale provision of plans for I-LBS. The other model is based on a bitmap as a raster-based map representation, which is semantically enriched with the help of image-processing algorithms and colour conventions. Here, too, options for automatically generating the semantic model, for example from photographed escape-route plans, are discussed. In a final step, both models are combined into a flexible hybrid environment model in order to answer queries as efficiently as possible depending on the available data. Indoor positioning is addressed by means of several improvements to fingerprinting approaches on smartphones. The fingerprinting is based either on camera images or on WLAN signals. In addition, further sensors, such as the compass and the accelerometer, are incorporated to improve accuracy and robustness.
To make positioning usable for I-LBS, however, the main requirement is not only high accuracy but above all great flexibility. For this purpose, an approach was developed that, without user interaction and based solely on map material and the inertial sensors of one or more users, builds a fingerprint database that can be made available to other users. Aiming to reduce cost and complexity and to solve the problem of keeping fingerprint databases up to date, the approach helps to roll out reference data for positioning automatically and on a large scale. To bridge the gap between I-LBS and LBS, however, it is not enough to consider both kinds of services separately. Seamless service usage must be possible, which makes both seamless positioning and the seamless provision of map material necessary. To this end, a platform was developed that, based on a sensor description language, automatically determines the selection and combination of sensors to be used for positioning. In addition, the platform contains a component that provides suitable environment models based on the position data and enables the transformation of position data between different models.
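As a point of reference for the fingerprinting approaches discussed above, the following minimal sketch shows classic WLAN fingerprinting with k-nearest neighbours. The access points and reference measurements are invented for illustration; the thesis's automatic, map-based construction of the fingerprint database and its fusion with camera and inertial sensors are not reproduced.

```python
# Minimal sketch of WLAN fingerprinting with k-nearest neighbours (toy data).
import numpy as np

# Offline phase: fingerprint database of (RSSI scan in dBm, known position in metres).
APS = ["ap1", "ap2", "ap3"]
DB = [
    ({"ap1": -40, "ap2": -70, "ap3": -80}, (1.0, 2.0)),
    ({"ap1": -65, "ap2": -45, "ap3": -75}, (6.0, 2.5)),
    ({"ap1": -80, "ap2": -60, "ap3": -42}, (3.0, 8.0)),
]

def to_vec(scan, missing=-100.0):
    """Turn an RSSI scan into a fixed-length vector; unseen APs get a floor value."""
    return np.array([scan.get(ap, missing) for ap in APS])

def estimate_position(scan, k=2):
    """Online phase: weighted average of the k closest reference fingerprints."""
    q = to_vec(scan)
    dists = np.array([np.linalg.norm(q - to_vec(fp)) for fp, _ in DB])
    idx = np.argsort(dists)[:k]
    weights = 1.0 / (dists[idx] + 1e-6)
    positions = np.array([DB[i][1] for i in idx])
    return tuple(np.average(positions, axis=0, weights=weights))

print(estimate_position({"ap1": -50, "ap2": -60, "ap3": -78}))
```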
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The goal of bioinformatics is to develop innovative and practical methods and algorithms for biological questions. In many cases, these questions are driven by new biotechnological techniques, especially by genome- and cell-wide high-throughput experimental studies. In principle there are two approaches: 1. Reduction and abstraction of the question to a clearly defined optimization problem, which can be solved with appropriate and efficient algorithms. 2. Development of context-based methods, incorporating as much contextual knowledge as possible in the algorithms, and derivation of practical solutions for relevant biological questions on the high-throughput data. These methods can often be supported by appropriate software tools and visualizations, allowing for interactive evaluation of the results by experts. Context-based methods are often much more complex and require more involved algorithmic techniques to obtain practically relevant and efficient solutions for real-world problems, as in many cases even the simplified abstraction of a problem already results in NP-hard problem instances. In many cases, to solve these complex problems, one needs to employ efficient data structures and heuristic search methods that solve clearly defined sub-problems using efficient (polynomial) optimization (such as dynamic programming, greedy, path or tree algorithms). In this thesis, we present new methods and analyses addressing open questions of bioinformatics from different contexts by incorporating the corresponding contextual knowledge. The two main contexts in this thesis are the protein structure similarity context (Part I) and the network-based interpretation of high-throughput data (Part II). For the protein structure similarity context (Part I) we analyze the consistency of gold standard structure classification systems and derive a consistent benchmark set usable for different applications. We introduce two methods (Vorolign, PPM) for the protein structure similarity recognition problem, based on different features of the structures. Derived from the idea and results of Vorolign, we introduce the concept of a contact neighborhood potential, aiming to improve the results of protein fold recognition and threading. For the re-scoring problem of predicted structure models we introduce the method Vorescore, clearly improving the fold-recognition performance and enabling the evaluation of the contact neighborhood potential for structure prediction methods in general. We introduce a contact-consistent Vorolign variant, ccVorolign, further improving the structure-based fold recognition performance and enabling direct optimization of the neighborhood potential in the future. Due to the enforcement of contact consistency, the ccVorolign method has much higher computational complexity than the polynomial Vorolign method - the cost of computing interpretable and consistent alignments. Finally, we introduce a novel structural alignment method (PPM) enabling the explicit modeling and handling of phenotypic plasticity in protein structures. We employ PPM for the analysis of effects of alternative splicing on protein structures. With the help of PPM we test the hypothesis that splice isoforms of the same protein can lead to protein structures with different folds (fold transitions). In Part II of the thesis we present methods generating and using context information for the interpretation of high-throughput experiments.
For the generation of context information on molecular regulations we introduce novel text-mining approaches that extract relations automatically from scientific publications. In addition to the fast NER (named entity recognition) method (syngrep) we also present a novel, fully ontology-based, context-sensitive method (SynTree) that allows for the context-specific disambiguation of ambiguous synonyms and results in much better identification performance. This context information is important for the interpretation of high-throughput data, but is often missing in current databases. Despite all improvements, the results of automated text-mining methods are error prone. The RelAnn application presented in this thesis helps to curate the automatically extracted regulations, enabling manual and ontology-based curation and annotation. For the usage of high-throughput data one needs additional methods for data processing, for example methods to map the hundreds of millions of short DNA/RNA fragments (so-called reads) onto a reference genome or transcriptome. Such data (RNA-seq reads) are the output of next-generation sequencing methods measured by sequencing machines, which are becoming more and more efficient and affordable. Unlike current state-of-the-art methods, our novel read-mapping method ContextMap resolves the occurring ambiguities at the final step of the mapping process, thereby employing the knowledge of the complete set of possible ambiguous mappings. This approach allows for higher precision, even if more nucleotide errors are tolerated in the read mappings in the first step. The consistency of the context information on molecular regulations, stored in databases or extracted via text mining, with the measured data can be used to identify and score consistent regulations (GGEA). This method substantially extends the commonly used gene-set based methods such as over-representation analysis (ORA) and gene set enrichment analysis (GSEA). Finally we introduce the novel method RelExplain, which uses the extracted contextual knowledge and generates network-based and testable hypotheses for the interpretation of high-throughput data.
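To make the dictionary-based part of such text mining concrete, the following toy sketch performs simple synonym-based named entity recognition. The synonym lists are invented, and the actual syngrep and SynTree methods (in particular the ontology-based, context-sensitive disambiguation) are far more elaborate than this.

```python
# Toy sketch of dictionary-based named entity recognition for gene/miRNA names.
import re

SYNONYMS = {
    "TP53": ["TP53", "p53", "tumor protein 53"],
    "hsa-miR-21": ["hsa-miR-21", "miR-21", "microRNA-21"],
}

# One case-insensitive alternation over all synonyms, longest synonyms first.
PATTERN = re.compile(
    "|".join(sorted((re.escape(s) for syns in SYNONYMS.values() for s in syns),
                    key=len, reverse=True)),
    flags=re.IGNORECASE,
)
# Map every synonym (lower-cased) back to its canonical identifier.
CANONICAL = {s.lower(): name for name, syns in SYNONYMS.items() for s in syns}

def find_entities(text):
    """Return (canonical name, matched text, span) for every dictionary hit."""
    return [(CANONICAL[m.group(0).lower()], m.group(0), m.span())
            for m in PATTERN.finditer(text)]

print(find_entities("We found that miR-21 represses p53 in these cells."))
```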
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Direct touch input on interactive surfaces has become a predominant standard for the manipulation of digital information in our everyday lives. However, compared to our rich interchange with the physical world, the interaction with touch-based systems is limited in terms of flexibility of input and expressiveness of output. In particular, the lack of tactile feedback greatly reduces the general usability of a touch-based system and hinders a productive entanglement of virtual information with the physical world. This thesis proposes remote tactile feedback as a novel method to provide programmed tactile stimuli supporting direct touch interactions. The overall principle is to spatially decouple the location of touch input (e.g. fingertip or hand) and the location of the tactile sensation on the user's body (e.g. forearm or back). Remote tactile feedback is an alternative concept which avoids particular challenges of existing approaches. Moreover, the principle provides inherent characteristics which can accommodate the requirements of current and future touch interfaces. To define the design space, the thesis provides a structured overview of current forms of touch surfaces and identifies trends towards non-planar and non-rigid forms with more versatile input mechanisms. Furthermore, a classification highlights limitations of the current methods to generate tactile feedback on touch-based systems. The proposed notion of tactile sensory relocation is a form of sensory substitution. Underlying neurological and psychological principles corroborate the approach. Thus, characteristics of the human sense of touch and principles from sensory substitution help to create a technical and conceptual framework for remote tactile feedback. Three consecutive user studies measure and compare the effects of both direct and remote tactile feedback on the performance and the subjective ratings of the user. Furthermore, the experiments investigate different body locations for the application of tactile stimuli. The results show high subjective preferences for tactile feedback, regardless of its type of application. Additionally, the data reveals no significant differences between the effects of direct and remote stimuli. The results back the feasibility of the approach and provide parameters for the design of stimuli and the effective use of the concept. The main part of the thesis describes the systematic exploration and analysis of the inherent characteristics of remote tactile feedback. Four specific features of the principle are identified: (1) the simplification of the integration of cutaneous stimuli, (2) the transmission of proactive, reactive and detached feedback, (3) the increased expressiveness of tactile sensations and (4) the provision of tactile feedback during multi-touch. In each class, several prototypical remote tactile interfaces are used in evaluations to analyze the concept. For example, the PhantomStation utilizes psychophysical phenomena to reduce the number of single tactile actuators. An evaluation with the prototype compares standard actuator technologies with each other in order to enable simple and scalable implementations. The ThermalTouch prototype creates remote thermal stimuli to reproduce material characteristics on standard touchscreens. The results show a stable rate of virtual object discrimination based on remotely applied temperature profiles.
The AutomotiveRTF system is implemented in a vehicle and supports the driver's input on the in-vehicle infotainment system. A field study with the system focuses on evaluating the effects of proactive and reactive feedback on the user's performance. The main contributions of the dissertation are: First, the thesis introduces the principle of remote tactile feedback and defines a design space for this approach as an alternative method to provide non-visual cues on interactive surfaces. Second, the thesis describes technical examples to rapidly prototype remote tactile feedback systems. Third, these prototypes are deployed in several evaluations which highlight the beneficial subjective and objective effects of the approach. Finally, the thesis presents features and inherent characteristics of remote tactile feedback as a means to support the interaction on today's touchscreens and future interactive surfaces.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Historically, driving-related and entertainment-related information has been arranged in spatially separate areas of the vehicle interior: displays required for the driving task are located directly in front of the driver (instrument cluster and head-up display), while the contents of the driver information system are placed in the centre console (central information display). This strict separation is currently beginning to dissolve; for example, parts of the infotainment content can now be accessed and operated via the instrument cluster. To enable drivers to handle the growing amount of infotainment content safely, to reduce the complexity of the driver interaction space and to increase customer value, this thesis considers the currently isolated displays holistically and re-explores the limits of today's strict distribution of information. It lays the foundations for operating and presenting distributed information in a manner compatible with the driving task, depending on the display surface used, develops concepts for user-initiated personalisation, and evaluates the interplay of different display surfaces. The studies conducted in this thesis show that a spatially distributed driver interaction space makes operating the driver information system safer and more attractive for the user.
Fakultät für Geschichts- und Kunstwissenschaften - Digitale Hochschulschriften der LMU
This dissertation analyses and compares, from a functional perspective, the economic records of the Dominican convent of Altenhohenau and the Benedictine convent of Neuburg an der Donau, and investigates how the nuns, reformed in the late Middle Ages, managed the convents' economic affairs from within the enclosure.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Many real-world application domains such as sensor-monitoring systems for environmental research or medical diagnostic systems are dealing with data that is represented by multiple observations. In contrast to single-observation data, where each object is assigned to exactly one occurrence, multi-observation data is based on several occurrences that are subject to two key properties: temporal variability and uncertainty. When defining similarity between data objects, these properties play a significant role. In general, methods designed for single-observation data hardly apply for multi-observation data, as they are either not supported by the data models or do not provide sufficiently efficient or effective solutions. Prominent directions incorporating the key properties are the fields of time series, where data is created by temporally successive observations, and uncertain data, where observations are mutually exclusive. This thesis provides research contributions for similarity processing - similarity search and data mining - on time series and uncertain data. The first part of this thesis focuses on similarity processing in time series databases. A variety of similarity measures have recently been proposed that support similarity processing w.r.t. various aspects. In particular, this part deals with time series that consist of periodic occurrences of patterns. Examining an application scenario from the medical domain, a solution for activity recognition is presented. Finally, the extraction of feature vectors allows the application of spatial index structures, which support the acceleration of search and mining tasks resulting in a significant efficiency gain. As feature vectors are potentially of high dimensionality, this part introduces indexing approaches for the high-dimensional space for the full-dimensional case as well as for arbitrary subspaces. The second part of this thesis focuses on similarity processing in probabilistic databases. The presence of uncertainty is inherent in many applications dealing with data collected by sensing devices. Often, the collected information is noisy or incomplete due to measurement or transmission errors. Furthermore, data may be rendered uncertain due to privacy-preserving issues with the presence of confidential information. This creates a number of challenges in terms of effectively and efficiently querying and mining uncertain data. Existing work in this field either neglects the presence of dependencies or provides only approximate results while applying methods designed for certain data. Other approaches dealing with uncertain data are not able to provide efficient solutions. This part presents query processing approaches that outperform existing solutions of probabilistic similarity ranking. This part finally leads to the application of the introduced techniques to data mining tasks, such as the prominent problem of probabilistic frequent itemset mining.
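As a concrete example of the kind of elastic similarity measure discussed for time series, the following sketch implements classic dynamic time warping (DTW). It is only the textbook measure shown for illustration; the thesis's own measures for periodic patterns and for uncertain data are not reproduced here.

```python
# Illustrative sketch: dynamic time warping (DTW), a standard elastic similarity
# measure for time series, computed by dynamic programming.
import numpy as np

def dtw_distance(a, b):
    """O(len(a)*len(b)) DTW with squared point-wise costs."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

x = np.sin(np.linspace(0, 2 * np.pi, 50))
y = np.sin(np.linspace(0, 2 * np.pi, 60) + 0.3)   # same shape, shifted and resampled
z = np.random.default_rng(0).normal(size=50)      # unrelated noise
print(dtw_distance(x, y), dtw_distance(x, z))     # x is much closer to y than to z
```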
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
In using the term 'hybrid interactions', we refer to interaction forms that comprise both tangible and intangible interactions as well as a close coupling of the physical or embodied representation with digital output. Until now, there has been no description of a formal design process for this emerging research domain, no description that can be followed during the creation of these types of interactions. As a result, designers face limitations in prototyping these systems. In this thesis, we share our systematic approach to envisioning, prototyping, and iteratively developing these interaction forms by following an extended interaction design process. We share our experiences with process extensions in the form of toolkits, which we built for this research and utilized to aid designers in the development of hybrid interactive systems. The proposed tools incorporate different characteristics and are intended to be used at different points in the design process. In Sketching with Objects, we describe a low-fdelity toolkit that is intended to be used in the very early phases of the process, such as ideation and user research. By introducing Paperbox, we present an implementation to be used in the mid-process phases for fnding the appropriate mapping between physical representation and digital content during the creation of tangible user interfaces (TUI) atop interactive surfaces. In a follow-up project, we extended this toolkit to also be used in conjunction with capacitive sensing devices. To do this, we implemented Sketch-a-TUI. This approach allows designers to create TUIs on capacitive sensing devices rapidly and at low cost. To lower the barriers for designers using the toolkit, we created the Sketch-a-TUIApp, an application that allows even novice users (users without previous coding experience) to create early instantiations of TUIs. In order to prototype intangible interactions, we used open soft- and hardware components and proposed an approach of investigating interactivity in correlation with intangible interaction forms on a higher fdelity. With our fnal design process extension, Lightbox, we assisted a design team in systematically developing a remote interaction system connected to a media façade covering a building. All of the above-mentioned toolkits were explored both in real-life contexts and in projects with industrial partners. The evaluation was therefore mainly performed in the wild, which led to the adaptation of metrics suitable to the individual cases and contexts.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The purpose of brain mapping is to advance the understanding of the relationship between structure and function in the human brain. Several techniques, each with different advantages and disadvantages, exist for recording neural activity. Functional magnetic resonance imaging (fMRI) has a high spatial resolution but low temporal resolution. It also suffers from a low signal-to-noise ratio in event-related experimental designs, which are commonly used to investigate neuronal brain activity. On the other hand, the high temporal resolution of electroencephalography (EEG) recordings makes it possible to capture evoked event-related potentials. Although 3D maps derived by EEG source reconstruction methods have a low spatial resolution, they provide complementary information about the location of neuronal activity. There is a strong interest in combining data from both modalities to gain a deeper knowledge of brain functioning through advanced statistical modeling. In this thesis, a new Bayesian method is proposed for enhancing fMRI activation detection by the use of EEG-based spatial prior information in stimulus-based experimental paradigms. This method builds upon a newly developed, purely fMRI-based activation detection method. In general, activation detection corresponds to stimulus predictor components having an effect on the fMRI signal trajectory in a voxelwise linear model. We model and analyze stimulus influence by a spatial Bayesian variable selection scheme, and extend existing high-dimensional regression methods by incorporating prior information on binary selection indicators via a latent probit regression. For purely fMRI-based activation detection, the predictor consists of a spatially-varying intercept only. For EEG-enhanced schemes, an EEG effect is added, which is either chosen to be spatially-varying or constant. Spatially-varying effects are regularized by different Markov random field priors. Statistical inference in the resulting high-dimensional hierarchical models becomes rather challenging from a modeling perspective as well as with regard to numerical issues. In this thesis, inference is based on a Markov chain Monte Carlo (MCMC) approach relying on global updates of effect maps. Additionally, a faster algorithm is developed based on single-site updates to circumvent the computationally intensive, high-dimensional, sparse Cholesky decompositions. The proposed algorithms are examined in both simulation studies and real-world applications. Performance is evaluated in terms of convergence properties, the ability to produce interpretable results, and the sensitivity and specificity of the corresponding activation classification rules. The main question is whether the use of EEG information can increase the power of fMRI models to detect activated voxels. In summary, the new algorithms show a substantial increase in sensitivity compared to existing fMRI activation detection methods like classical SPM. Carefully selected EEG prior information additionally increases sensitivity in activation regions that have been distorted by a low signal-to-noise ratio.
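To make the notion of "activation detection in a voxelwise linear model" concrete, the following sketch runs a deliberately simple non-Bayesian, non-spatial baseline: per-voxel ordinary least squares of a simulated BOLD signal on a stimulus predictor, followed by thresholding the resulting t-statistics. The spatial Bayesian variable selection, probit prior and EEG-informed priors of the thesis are not reproduced; all data and the threshold are made up.

```python
# Baseline sketch of voxelwise activation detection with an ordinary linear model.
import numpy as np

rng = np.random.default_rng(1)
T, V = 200, 500                                       # time points, voxels
stimulus = (np.arange(T) % 20 < 10).astype(float)     # toy block-design predictor
X = np.column_stack([np.ones(T), stimulus])           # intercept + stimulus

beta_true = np.zeros(V)
beta_true[:50] = 1.0                                  # first 50 voxels are "active"
Y = X @ np.vstack([np.zeros(V), beta_true]) + rng.normal(scale=1.0, size=(T, V))

# OLS for all voxels at once: beta_hat has shape (2, V).
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
sigma2 = (resid ** 2).sum(axis=0) / (T - 2)
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])   # std. error of stimulus effect
t_stat = beta_hat[1] / se

detected = t_stat > 3.0                               # crude fixed threshold
print("detected active voxels:", detected[:50].sum(), "of 50 truly active")
print("false positives:", detected[50:].sum(), "of", V - 50)
```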
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Fri, 26 Oct 2012 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/15038/ https://edoc.ub.uni-muenchen.de/15038/1/bauer_sebastian.pdf Bauer, Sebastian ddc:004, ddc:000, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Random Forests are widely used for data prediction and interpretation purposes. They show many appealing characteristics, such as the ability to deal with high dimensional data, complex interactions and correlations. Furthermore, missing values can easily be processed by the built-in procedure of surrogate splits. However, there is only little knowledge about the properties of recursive partitioning in missing data situations. Therefore, extensive simulation studies and empirical evaluations have been conducted to gain deeper insight. In addition, new methods have been developed to enhance methodology and solve current issues of data interpretation, prediction and variable selection. A variable's relevance in a Random Forest can be assessed by means of importance measures. Unfortunately, existing methods cannot be applied when the data contain missing values. Thus, one of the most appreciated properties of Random Forests – its ability to handle missing values – gets lost for the computation of such measures. This work presents a new approach that is designed to deal with missing values in an intuitive and straightforward way, yet retains widely appreciated qualities of existing methods. Results indicate that it meets sensible requirements and shows good variable ranking properties. Random Forests provide variable selection that is usually based on importance measures. An extensive review of corresponding literature led to the development of a new approach that is based on a profound theoretical framework and meets important statistical properties. A comparison to another eight popular methods showed that it controls the test-wise and family-wise error rate, provides a higher power to distinguish relevant from non-relevant variables and leads to models located among the best performing ones. Alternative ways to handle missing values are the application of imputation methods and complete case analysis. Yet it is unknown to what extent these approaches are able to provide sensible variable rankings and meaningful variable selections. Investigations showed that complete case analysis leads to inaccurate variable selection as it may inappropriately penalize the importance of fully observed variables. By contrast, the new importance measure decreases for variables with missing values and therefore causes selections that accurately reflect the information given in actual data situations. Multiple imputation leads to an assessment of a variable's importance and to selection frequencies that would be expected for data that was completely observed. In several performance evaluations the best prediction accuracy emerged from multiple imputation, closely followed by the application of surrogate splits. Complete case analysis clearly performed worst.
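For orientation, the following sketch shows the classical permutation importance for a Random Forest on complete data, i.e. the kind of measure that breaks down once values are missing. It assumes scikit-learn is available and uses simulated data; the thesis's new importance measure for incomplete data works differently and is not reproduced here.

```python
# Sketch of classical permutation importance for a Random Forest on complete data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 4))
# Only the first two variables are informative; the last two are pure noise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
baseline = forest.score(X, y)

def permutation_importance(model, X, y, n_repeats=10):
    """Importance of variable j = mean drop in accuracy after permuting column j."""
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            drops.append(baseline - model.score(Xp, y))
        importances[j] = np.mean(drops)
    return importances

print(np.round(permutation_importance(forest, X, y), 3))
```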
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
As software systems rise in size and complexity, the need for verifying some of their properties increases. One important property to be verified is resource usage, i.e. how many resources the program will need for its execution, where resources include execution time, memory, power, etc. Resource usage analysis is important in many areas, in particular embedded systems and cloud computing. Thus, resource analysis has been widely researched and several different approaches have been proposed, based in particular on recurrence solving, abstract interpretation and amortised analysis. In the amortised analysis technique, a nonnegative number, called potential, is assigned to a data structure. The amortised cost of an operation is then defined as its actual cost plus the difference in potential of the data structure before and after performing the operation. Amortised analysis has been used for automatic resource analysis of functional and object-oriented programs. The potentials are defined using refined types, and typing rules then ensure that potential and actual resource usage is accounted for correctly. The automatic inference of the potential functions can then be achieved by type inference. In the case of functional programs, the structure of the types is known. Thus, type inference can be reduced to solving linear arithmetic constraints. For object-oriented programs, however, the refined types are more complicated because of the general nature of objects: they can be used to define any data structure. Thus, the type inference must discover not only the potential functions for the data structure but also the data structures themselves. Other features of object-oriented programs that complicate the analysis are aliasing and imperative update. Hofmann and Jost presented in 2006 a type system for amortised heap-space analysis of object-oriented programs, called Resource Aware JAva (RAJA). However, they left the problem of type inference open. In this thesis we present a type inference algorithm for the RAJA system. We were able to reduce the type inference problem to the novel problem of satisfiability of arithmetic constraints over infinite trees, and we developed a heuristic algorithm for the satisfiability of these constraints. We proved the soundness of the type inference algorithm and developed an OCaml implementation and experimental evaluation that shows that we can compute linear upper bounds on the heap-space requirements of many programs, including sorting algorithms for lists such as insertion sort and merge sort and also programs that contain different interacting objects that describe real-life scenarios like a bank account. Another contribution of this thesis is a type checking algorithm for the RAJA system that is useful for verifying the types discovered by the type inference by using proof-carrying code technology.
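To illustrate the amortised-analysis bookkeeping described above (amortised cost = actual cost plus change in potential), the following sketch works through the textbook dynamic-array example rather than the heap space of Java objects treated in the thesis. The potential function and costs are the standard textbook choices, not anything specific to RAJA.

```python
# Worked example of the potential method: with the potential phi = 2*size - capacity,
# every append to a doubling array has amortised cost <= 3, although an occasional
# append actually costs size + 1 (copying all elements on resize).
class DynArray:
    def __init__(self):
        self.size, self.capacity = 0, 1

    def phi(self):
        return 2 * self.size - self.capacity

    def append(self):
        """Return the actual cost of one append (1, plus copies on resize)."""
        cost = 1
        if self.size == self.capacity:        # full: double capacity, copy elements
            cost += self.size
            self.capacity *= 2
        self.size += 1
        return cost

arr = DynArray()
for i in range(1, 17):
    phi_before = arr.phi()
    actual = arr.append()
    amortised = actual + arr.phi() - phi_before   # actual cost + change in potential
    assert amortised <= 3
    print(f"append {i:2d}: actual={actual:2d}  amortised={amortised}")
```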
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
A generally observable trend of the past 10 years is that the number of sensors embedded in mobile devices such as smartphones and tablets is rising steadily. Arguably, the available sensors are mostly underutilized by existing mobile user interfaces. In this dissertation, we explore sensor-based user interface concepts for mobile devices with the goal of making better use of the available sensing capabilities on mobile devices as well as gaining insights on the types of sensor technologies that could be added to future mobile devices. We are particularly interested in how novel sensor technologies could be used to implement novel and engaging mobile user interface concepts. We explore three particular areas of interest for research into sensor-based user interface concepts for mobile devices: continuous interaction, around-device interaction and motion gestures. For continuous interaction, we explore the use of dynamic state-space systems to implement user interfaces based on a constant sensor data stream. In particular, we examine zoom automation in tilt-based map scrolling interfaces. We show that although fully automatic zooming is desirable in certain situations, adding a manual override capability of the zoom level (Semi-Automatic Zooming) will increase the usability of such a system, as shown through a decrease in task completion times and improved user ratings in a user study. The presented work on continuous interaction also highlights how the sensors embedded in current mobile devices can be used to support complex interaction tasks. We go on to introduce the concept of Around-Device Interaction (ADI). By extending the interactive area of the mobile device to its entire surface and the physical volume surrounding it, we aim to show how the expressivity and possibilities of mobile input can be improved. We derive a design space for ADI and evaluate three prototypes in this context. HoverFlow is a prototype allowing coarse hand gesture recognition around a mobile device using only a simple set of sensors. PalmSpace is a prototype exploring the use of depth cameras on mobile devices to track the user's hands in direct manipulation interfaces through spatial gestures. Lastly, the iPhone Sandwich is a prototype supporting dual-sided pressure-sensitive multi-touch interaction. Through the results of user studies, we show that ADI can lead to improved usability for mobile user interfaces. Furthermore, the work on ADI contributes suggestions for the types of sensors that could be incorporated in future mobile devices to expand the input capabilities of those devices. In order to broaden the scope of uses for mobile accelerometer and gyroscope data, we conducted research on motion gesture recognition. With the aim of supporting practitioners and researchers in integrating motion gestures into their user interfaces at early development stages, we developed two motion gesture recognition algorithms, the $3 Gesture Recognizer and Protractor 3D, that are easy to incorporate into existing projects, have good recognition rates and require a small amount of training data. To exemplify an application area for motion gestures, we present the results of a study on the feasibility and usability of gesture-based authentication. With the goal of making it easier to connect meaningful functionality with gesture-based input, we developed Mayhem, a graphical end-user programming tool for users without prior programming skills. Mayhem can be used for rapid prototyping of mobile gestural user interfaces.
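For illustration, the following sketch is written in the spirit of the $-family of template recognizers: resample a 3D motion trace to a fixed number of points, normalise translation and scale, and pick the closest template by average point-wise distance. It is not the exact $3 Gesture Recognizer or Protractor 3D algorithm (for example, their rotation search and closed-form alignment are omitted), and the templates are made up.

```python
# Sketch of a simple template-based 3D gesture recognizer ($-family style).
import numpy as np

def resample(points, n=32):
    """Resample a (k, 3) trajectory to n points equally spaced along its path."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, cum[-1], n)
    out = np.empty((n, 3))
    for d in range(3):
        out[:, d] = np.interp(targets, cum, points[:, d])
    return out

def normalise(points):
    """Translate to the centroid and scale to unit bounding-box size."""
    p = points - points.mean(axis=0)
    extent = np.ptp(p, axis=0).max()
    return p / extent if extent > 0 else p

def recognise(trace, templates):
    """Return the template name with the smallest mean point-wise distance."""
    q = normalise(resample(trace))
    scores = {name: np.linalg.norm(q - normalise(resample(t)), axis=1).mean()
              for name, t in templates.items()}
    return min(scores, key=scores.get), scores

templates = {                                  # tiny made-up gesture templates
    "swipe":  [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)],
    "corner": [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 2, 0)],
}
print(recognise([(0, 0, 0), (1.1, 0.1, 0), (2.2, 0, 0.1), (3.1, 0.05, 0)], templates))
```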
The main contribution of this dissertation is the development of a number of novel user interface concepts for sensor-based interaction. They will help developers of mobile user interfaces make better use of the existing sensory capabilities of mobile devices. Furthermore, manufacturers of mobile device hardware obtain suggestions for the types of novel sensor technologies that are needed in order to expand the input capabilities of mobile devices. This allows the implementation of future mobile user interfaces with increased input capabilities, more expressiveness and improved usability.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Thanks to the wide availability of GPS, location-based services (LBS) have grown into a mass phenomenon in recent years. More and more context information is being used, in particular to control mobile devices, because both the usability and the information density of small smartphones stand in contrast to the wealth of information on the Internet. As a result, many services are no longer provided solely in response to user input (reactively) but instead offer the user relevant information fully automatically and context-dependently (proactively). Proactive service provision and location-based personalisation improve the usability of such services. Unfortunately, services of this kind can currently only be offered outside of buildings, because, on the one hand, no uniform and inexpensive positioning system is available and, on the other hand, the necessary map data does not exist. Regarding map data in particular, simple standards and tools are lacking that would allow a building owner to provide high-quality map data. This dissertation presents several necessary steps towards enabling ubiquitous and scalable indoor navigation. The topics of positioning, modelling and cartography, as well as navigation and wayfinding, are placed in a comprehensive context in order to enable viable, simple and scalable navigation. First, several improvements to terminal-based WLAN indoor positioning systems are proposed and discussed, based partly on the use of new sensors in current smartphones and partly on the use of particularly compact environment models on mobile devices. In particular, methods are proposed and evaluated with which the additional information required for large-scale indoor navigation can be modelled from typical CAD data with little editing effort. Furthermore, methods are investigated that can extract these semantic extensions semi- or fully automatically from drawings. Starting from the goal of enabling large-scale navigation based on CAD data, one encounters the problem of ordering a set of points of interest such that the travel route is short. This problem is related to the NP-complete travelling salesman problem. However, the geometric situation inside buildings is so complex that most currently known heuristic algorithms for the travelling salesman problem cannot easily be transferred to indoor settings. For this problem, a heuristic algorithm is given that can solve smaller instances with acceptable error in linear running time. These methods were developed and implemented in a project with Munich Airport, whose aim was to solve the open problems of an already existing small-scale demonstrator implementation of a passenger information system. The navigation component of the passenger information system InfoGate is based on these algorithms and methods; it enables large-scale navigation in the terminals of Munich Airport with distributed display systems and has been in productive operation since 6 June 2011. This made it possible to evaluate the methods of this thesis in a real, complex environment.
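To make the point-of-interest ordering problem concrete, the following sketch applies the textbook nearest-neighbour greedy heuristic to a small set of 2D points. It is only the generic baseline; the thesis's linear-time, building-aware heuristic, which works on indoor walking distances rather than straight-line distances, is not reproduced here.

```python
# Generic nearest-neighbour heuristic for ordering points of interest into a short tour.
import numpy as np

def nearest_neighbour_tour(points, start=0):
    """points: (n, 2) coordinates. Returns a visiting order starting at `start`."""
    points = np.asarray(points, dtype=float)
    unvisited = set(range(len(points))) - {start}
    tour, current = [start], start
    while unvisited:
        nxt = min(unvisited, key=lambda j: np.linalg.norm(points[j] - points[current]))
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    return tour

def tour_length(points, tour):
    points = np.asarray(points, dtype=float)
    return sum(np.linalg.norm(points[tour[i + 1]] - points[tour[i]])
               for i in range(len(tour) - 1))

pois = [(0, 0), (5, 1), (1, 4), (6, 5), (2, 1)]
order = nearest_neighbour_tour(pois)
print(order, round(tour_length(pois, order), 2))
```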
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The interconnection of vehicles is an important topic concerning the enhancement of traffic safety and traffic efficiency. Therefore, vehicles exchange position and state information with each other to establish an awareness level for the calculation of collision probabilities with their neighbors. To recognize critical safety situations it is essential to receive information reliably. However, these systems are typically based on wireless ad-hoc networks that cannot guarantee reliable packet transmission. This is especially the case in situations where a high number of communication nodes exist. The aim of the work at hand is the definition of a beaconing algorithm that enables the establishment of a desired awareness level for critical safety situations, especially in high-density traffic scenarios. First, representative scenarios for collision detection and collision avoidance were specified and metrics for the evaluation of the beaconing algorithms were defined. Based on these metrics, the performance of beaconing algorithms with different static periodic update rates was evaluated. It is shown that these kinds of beaconing algorithms cannot provide sufficient results with respect to the required constant information throughput in high-density traffic situations. To provide a high awareness level for each vehicle in its individual situation in spite of an unreliable communication channel, a service-oriented beaconing approach is defined in this work. It is based on a request/response communication scheme to compensate for particular packet losses. A broadcast and a unicast variant of the algorithm are defined according to the corresponding representative scenarios. It is shown that the service-oriented beaconing approach provides a significant benefit with respect to the information throughput in low and medium traffic density situations. However, in high-density situations the benefit of this approach decreases due to the increased communication overhead. This is a result of using one single communication channel. To achieve a high awareness level in high-density traffic situations as well, a significant modification was defined: the usage of multiple communication channels was introduced to distribute the communication load over several channels. All service responses are sent on a dedicated service channel to reduce the load on the control channel where the service requests are transmitted. After an extensive evaluation of the multi-channel utilization in vehicular ad-hoc networks using IEEE 802.11p, it is shown that the multi-channel version of the service-oriented beaconing approach can achieve significant benefits concerning the information throughput for the collision detection and collision avoidance scenarios even in high-density traffic situations.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The human-machine interfaces (HMIs) of today’s premium automotive infotainment systems are complex embedded systems which have special characteristics in comparison to GUIs of standard PC applications, in particular regarding their variability. The variability of infotainment system HMIs results from different car models, product series, markets, equipment configuration possibilities, system types and languages and necessitates enormous testing efforts. The model-based testing approach is a promising solution for reducing testing efforts and increasing test coverage. However, while model-based testing has been widely used for function tests of subsystems in practice, HMI tests have remained manual or only semi-automated and are very time-consuming and work-intensive. Also, it is very difficult to achieve systematic or high test coverage via manual tests. A large amount of research work has addressed GUI testing in recent years. In addition, variability is becoming an ever more popular topic in the domain of software product line development. However, a model-based testing approach for complex HMIs which also considers variability is still lacking. This thesis presents a model-based testing approach for infotainment system HMIs with the particular aim of resolving the variability problem. Furthermore, the thesis provides a foundation for future standards of HMI testing in practice. The proposed approach is based on a model-based HMI testing framework which includes two essential components: a test-oriented HMI specification and a test generation component. The test-oriented HMI specification has a layered structure and is suited to specifying data which is required for testing different features of the HMI. Both the dynamic behavior and the representation of the HMI are the testing focuses of this thesis. The test generation component automatically generates tests from the test-oriented HMI specification. Furthermore, the framework can be extended in order to automatically execute the generated tests. Generated tests must first be initialized, which means that they are enhanced with concrete user input data. Afterwards, initialized tests can be automatically executed with the help of a test execution tool which must be extended into the testing framework. In this thesis, it is proposed to specify and test different HMI-variants which have a large set of commonalities based on the software product line approach. This means the test-oriented HMI specification is extended in order to describe the commonalities and variabilities between HMI variants of an HMI product line. In particular, strategies are developed in order to generate tests for different HMI products. One special feature is that redundancies are avoided both for the test generation and the execution processes. This is especially important for the industrial practice due to limited test resources. Modeling and testing variability of automotive HMIs make up the main research contributions of this thesis. We hope that the results presented in this thesis will offer GUI testing research a solution for model-based testing of multi-variant HMIs and provide the automotive industry with a foundation for future HMI testing standards.
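As a very small illustration of what "generating tests from a behaviour model" can mean, the following sketch derives one abstract test sequence per transition of a made-up HMI state model using breadth-first search. The model, states and events are invented; the thesis's layered test-oriented HMI specification and its product-line/variability handling are far richer than this.

```python
# Minimal sketch of model-based test generation: cover every transition of a toy
# HMI state model with one event sequence each.
from collections import deque

# Made-up HMI behaviour model: state -> {user event: next state}
MODEL = {
    "Main": {"press_media": "Media", "press_nav": "Navigation"},
    "Media": {"press_back": "Main", "press_play": "Playing"},
    "Playing": {"press_back": "Media"},
    "Navigation": {"press_back": "Main"},
}

def generate_tests(model, start="Main"):
    """Return one test case (event sequence from the start state) per transition."""
    # Shortest event path from the start state to every reachable state (BFS).
    paths, queue = {start: []}, deque([start])
    while queue:
        state = queue.popleft()
        for event, nxt in model.get(state, {}).items():
            if nxt not in paths:
                paths[nxt] = paths[state] + [event]
                queue.append(nxt)
    # Each test: reach the source state, then fire the transition under test.
    return [paths[src] + [event] for src in paths for event in model.get(src, {})]

for test in generate_tests(MODEL):
    print(" -> ".join(test))
```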
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Over the last decades collaborative learning has gained immensely in importance and popularity due to its high potential. Unfortunately, learners rarely engage in effective learning activities unless they are provided with instructional support. In order to maximize learning outcomes it is therefore advisable to structure collaborative learning sessions. One way of doing this is using collaboration scripts, which define a sequence of activities to be carried out by the learners. The field of computer-supported collaborative learning (CSCL) produced a variety of collaboration scripts that proved to have positive effects on learning outcomes. These scripts provide detailed descriptions of successful learning scenarios and are therefore used as foundation for this thesis. In many cases computers are used to support collaborative learning. Traditional personal computers are often chosen for this purpose. However, during the last decades new technologies have emerged, which seem to be better suited for co-located collaboration than personal computers. Large interactive displays, for example, allow a number of people to work simultaneously on the same surface while being highly aware of the co-learners' actions. There are also multi-display environments that provide several workspaces, some of which may be shared, others may be personal. However, there is a lack of knowledge regarding the influence of different display types on group processes. For instance, it remains unclear in which cases shareable user interfaces should replace traditional single-user devices and when both personal and shared workspaces should be provided. This dissertation therefore explores the role of personal and shared workspaces in various situations in the area of collaborative learning. The research questions include the choice of technological devices, the seating arrangement as well as how user interfaces can be designed to guide learners. To investigate these questions a two-fold approach was chosen. First, a framework was developed, which supports the implementation of scripted collaborative learning applications. Second, different prototypes were implemented to explore the research questions. Each prototype is based on at least one collaboration script. The result is a set of studies, which contribute to answering the above-mentioned research questions. With regard to the choice of display environment the studies showed several reasons for integrating personal devices such as laptops. Pure tabletop applications with around-the-table seating arrangements whose benefits for collaboration are widely discussed in the relevant literature revealed severe drawbacks for text-based learning activities. The combination of laptops and an interactive wall display, on the other hand, turned out to be a suitable display environment for collaborative learning in several cases. In addition, the thesis presents several ways of designing the user interface in a way that guides learners through collaboration scripts.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
With the growing amount of data handled by Internet-enabled mobile devices, the task of preventing software from leaking confidential information is becoming increasingly important. At the same time, mobile applications are typically executed on different devices whose users have varying requirements for the privacy of their data. Users should be able to define their personal information security settings, and they should get a reliable assurance that the installed software respects these settings. Language-based information flow security focuses on the analysis of programs to determine information flows among accessed data resources of different security levels, and to verify and formally certify that these flows follow a given policy. In the mobile code scenario, however, both the dynamic aspect of the security environment and the fact that mobile software is distributed as bytecode pose a challenge for existing static analysis approaches. This thesis presents a language-based mechanism to certify information flow security in the presence of dynamic environments. An object-oriented high-level language as well as a bytecode language are equipped with facilities to inspect user-defined information flow security settings at runtime. This way, the software developer can create privacy-aware programs that can adapt their behaviour to arbitrary security environments, a property that is formalized as "universal noninterference". This property is statically verified by an information flow type system that uses restrictive forms of dependent types to judge abstractly on the concrete security policy that is effective at runtime. To verify compiled bytecode programs, a low-level version of the type system is presented that works on an intermediate code representation in which the original program structure is partially restored. Rigorous soundness proofs and a type-preserving compilation enable the generation of certified bytecode programs in the style of proof-carrying code. To show the practical feasibility of the approach, the system is implemented and demonstrated on a concrete application scenario, where personal data are sent from a mobile device to a server on the Internet.
Fakultät für Psychologie und Pädagogik - Digitale Hochschulschriften der LMU
Web design skills are an important component of media literacy. The aim of our study was to promote university students' web design skills through online design-based learning (DBL). Combined in a 2x2 factorial design, two types of scaffolding, namely collaboration scripts and incomplete concept maps, were implemented in an online DBL environment to support the students in their effort to design, build, modify, and publish web sites, and their effects were examined on process and outcome measures. The results showed that both treatments had positive effects on collaborative (content-related discourse quality, collaboration skills, and quality of published web sites) and individual (domain-specific knowledge and skills related to the design and building of websites) learning outcomes. There was synergism between the two scaffolds in that the combination of the collaboration script and incomplete concept maps produced the most positive results. To be effective, online DBL thus needs to be enhanced by appropriate scaffolds, and both collaboration scripts and incomplete concept maps are effective examples.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The present thesis compares two computational interpretations of non-constructive proofs: refined A-translation and Gödel's functional "Dialectica" interpretation. The behaviour of the extraction methods is evaluated in the light of several case studies, where the resulting programs are analysed and compared. It is argued that the two interpretations correspond to specific backtracking implementations and that programs obtained via the refined A-translation tend to be simpler, faster and more readable than programs obtained via Gödel's interpretation. Three layers of optimisation are suggested in order to produce faster and more readable programs. First, it is shown that syntactic repetition of subterms can be reduced by using let-constructions instead of meta substitutions, thus obtaining a near-linear size bound on extracted terms. The second improvement allows syntactically computational parts of the proof to be declared irrelevant; this can be used to remove redundant parameters, possibly improving the efficiency of the program. Finally, a special case of induction is identified for which a more efficient recursive extracted term can be defined. It is shown that the outcome of case distinctions can be memoised, which can result in an exponential improvement of the average time complexity of the extracted program.
Fakultät für Sprach- und Literaturwissenschaften - Digitale Hochschulschriften der LMU
The subject of this dissertation is the further development of the existing EMU Speech Database System, which supports researchers and students of phonetics and linguistics in building, labelling, querying and analysing speech databases. EMU can process both acoustic and articulatory data and stands out from other comparable software through its extensive structuring options for labelling, the query language adapted to them, its interface to R, and the central management of the data. With regard to usability, documentation of the system's internal and external structures, which had previously been missing or incomplete, was written. For existing functions that could only be used in a text-based way, graphical user interfaces were implemented, thereby simplifying their use. As part of the extension of functionality, the scripting capabilities of the system were expanded in several ways and fundamentally new functions were added. Interoperability was extended through database import and export functionality via indirect and direct interfaces to the programs Praat, WaveSurfer and Articulate Assistant. Furthermore, the Kiel Corpus was converted into an EMU speech database; in this form it can now be represented very well in a hierarchical labelling structure, queried in a user-friendly way and thus analysed accordingly. For the further development of the system, migrating the EMU data model to a relational one is proposed, since with suitable modelling the querying of large data corpora demonstrably becomes more time-efficient and the query capabilities are extended. These developments have made the software efficiently usable for teaching and research. EMU can thus constitute an important step within the workflow of far more users when addressing research questions.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
In biological research, diverse high-throughput techniques enable the investigation of whole systems at the molecular level. The development of new methods and algorithms is necessary to analyze and interpret measurements of gene and protein expression and of interactions between genes and proteins. One of the challenges is the integrated analysis of gene expression and the associated regulation mechanisms. The two most important types of regulators, transcription factors (TFs) and microRNAs (miRNAs), often cooperate in complex networks at the transcriptional and post-transcriptional level and, thus, enable a combinatorial and highly complex regulation of cellular processes. For instance, TFs activate and inhibit the expression of other genes including other TFs, whereas miRNAs can post-transcriptionally induce the degradation of transcribed RNA and impair the translation of mRNA into proteins. The identification of gene regulatory networks (GRNs) is mandatory in order to understand the underlying control mechanisms. The expression of regulators is itself regulated, i.e. regulators are activated or inhibited under varying conditions and perturbations. Thus, measurements of gene expression following targeted perturbations (knockouts or overexpressions) of these regulators are of particular importance. The prediction of the activity states of the regulators and the prediction of the target genes are first important steps towards the construction of GRNs. This thesis deals with these first bioinformatics steps to construct GRNs. Targets of TFs and miRNAs are determined as comprehensively and accurately as possible. The activity state of regulators is predicted for specific high-throughput data and specific contexts using appropriate statistical approaches. Moreover, (parts of) GRNs are inferred, which lead to explanations of given measurements. The thesis describes new approaches for these tasks together with accompanying evaluations and validations. This immediately defines the three main goals of the current thesis: 1. The development of a comprehensive database of regulator-target relations. Regulators and targets are retrieved from public repositories, extracted from the literature via text mining and collected into the miRSel database. In addition, relations can be predicted using various published methods. In order to determine the activity states of regulators (see 2.) and to infer GRNs (see 3.), comprehensive and accurate regulator-target relations are required. It could be shown that text mining enables the reliable extraction of miRNA, gene, and protein names as well as their relations from scientific free texts. Overall, miRSel contains about three times as many relations for the model organisms human, mouse, and rat as state-of-the-art databases (e.g. TarBase, one of the most widely used resources for miRNA-target relations). 2. The prediction of activity states of regulators based on improved target sets. In order to investigate mechanisms of gene regulation, the experimental contexts have to be determined in which the respective regulators become active. A regulator is predicted as active based on appropriate statistical tests applied to the expression values of its set of target genes. For this task various gene set enrichment (GSE) methods have been proposed. Unfortunately, before an actual experiment it is unknown which genes are affected. The missing standard-of-truth has so far prevented the systematic assessment and evaluation of GSE tests.
In contrast, the trigger of gene expression changes is of course known for experiments where a particular regulator has been directly perturbed (i.e. by knockout, transfection, or overexpression). Based on such datasets, we have systematically evaluated 12 current GSE tests. In our analysis, ANOVA and the Wilcoxon test performed best. 3. The prediction of regulation cascades. Using gene expression measurements and given regulator-target relations (e.g. from the miRSel database), GRNs are derived. GSE tests are applied to determine TFs and miRNAs that change their activity as a cellular response to an overexpressed miRNA. Gene regulatory networks can be constructed iteratively. Our models show how miRNAs trigger gene expression changes: either directly or indirectly via cascades of miRNA-TF, miRNA-kinase-TF as well as TF-TF relations. In this thesis we focus on measurements which have been obtained after overexpression of miRNAs. Surprisingly, a number of cancer-relevant miRNAs influence a common core of TFs which are involved in processes such as proliferation and apoptosis.
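As a minimal illustrative sketch of the gene set enrichment idea (hypothetical gene names and target sets, not the evaluation from the thesis), the following Python code scores a regulator as active if the expression changes of its assumed target genes differ significantly from those of all other genes, using a Wilcoxon rank-sum test:

    import numpy as np
    from scipy.stats import ranksums

    def regulator_activity(log_fold_changes, target_genes):
        """Score a regulator via enrichment of its targets.

        log_fold_changes: dict gene -> expression change in the experiment
        target_genes:     set of genes assumed to be targets of the regulator
        Returns the rank-sum statistic and p-value comparing targets vs. rest.
        """
        targets = np.array([v for g, v in log_fold_changes.items() if g in target_genes])
        rest = np.array([v for g, v in log_fold_changes.items() if g not in target_genes])
        return ranksums(targets, rest)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        genes = {f"g{i}": rng.normal(0.0, 1.0) for i in range(1000)}
        # Hypothetical target set whose expression is shifted downwards,
        # as expected after overexpressing a repressive miRNA.
        targets = {f"g{i}" for i in range(50)}
        for g in targets:
            genes[g] -= 1.5
        stat, p = regulator_activity(genes, targets)
        print(f"rank-sum statistic={stat:.2f}, p-value={p:.2e}")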
Fakultät für Sprach- und Literaturwissenschaften - Digitale Hochschulschriften der LMU
The dissertation 'Antikerezeption im Internet' (the reception of antiquity on the Internet) examines how knowledge about classical antiquity appears in the medium of the Internet. First, a theoretical framework is sketched with recourse to constructivist systems theory; in addition, some central characteristics of the Internet are presented. A research report attempts to bring together the discussions on the reception of antiquity on the Internet. The individual studies focus on online libraries of Latin literature and on online encyclopaedias, above all Wikipedia, but novel media formats such as online grammars are also discussed. A further focus lies on the study of particular social systems and their specific reception of antiquity: science and scholarship, schools, political websites, the mass media and Internet art are considered here. Finally, using the example of an Internet search on Cicero's philosophical writings, a concluding overview is sought of the possibilities of gaining knowledge about antiquity from the Internet.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Biological studies across all omics fields generate vast amounts of data. To understand these complex data, biologically motivated data mining techniques are indispensable. Evaluation of the high-throughput measurements usually relies on the identification of underlying signals as well as of shared or outstanding characteristics. To this end, methods have been developed to recover the source signals of a given dataset, to reveal objects which are more similar to each other than to the remaining objects, and to detect observations which stand out against the background dataset. Each biological problem was addressed individually with solutions from computer science tailored to its needs. The study of protein-protein interactions (the interactome) focuses on the identification of clusters, i.e. densely interlinked sub-graphs of the interaction graph: a parameter-free graph clustering algorithm based on the concept of graph compression was developed in order to find sets of highly interlinked proteins sharing similar characteristics. The study of lipids (the lipidome) calls for co-regulation analyses: to reveal lipids that respond similarly to biological factors, partial correlations were generated with differential Gaussian Graphical Models while accounting solely for disease-specific correlations. The study on the single cell level (cytomics) aims to understand cellular systems, often with the help of microscopy techniques: a novel noise-robust source separation technique allowed independent components describing protein behaviour to be reliably extracted from microscopy images. The study of peptides (peptidomics) often requires the detection of outstanding observations: by assessing regularities in the data set, an outlier detection algorithm was implemented based on the compression efficacy of independent components of the dataset. The developed algorithms had to fulfil very diverse constraints in each omics field, yet all of these constraints were met with methods derived from standard correlation and dependency analyses.
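As a minimal illustrative sketch of the co-regulation idea (plain partial correlations from simulated data, not the thesis's differential Gaussian Graphical Model), the following Python code derives partial correlations from the inverse of a covariance matrix, which is the standard route to a Gaussian Graphical Model:

    import numpy as np

    def partial_correlations(data):
        """Partial correlation matrix from samples (rows) x variables (columns)."""
        precision = np.linalg.pinv(np.cov(data, rowvar=False))
        d = np.sqrt(np.diag(precision))
        pcorr = -precision / np.outer(d, d)
        np.fill_diagonal(pcorr, 1.0)
        return pcorr

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        n = 500
        # Hypothetical lipid measurements: l2 depends on l1, l3 is independent.
        l1 = rng.normal(size=n)
        l2 = 0.8 * l1 + 0.2 * rng.normal(size=n)
        l3 = rng.normal(size=n)
        print(np.round(partial_correlations(np.column_stack([l1, l2, l3])), 2))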
Fakultät für Psychologie und Pädagogik - Digitale Hochschulschriften der LMU
Wikis have been used increasingly as a tool for collaborative authoring and knowledge management (KM) in recent years. One possible application is the collective documentation of technical knowledge and experience within a corporation or its departments. However, as with other new technologies, the success of wikis depends on the intended users' acceptance, which cannot always be presupposed. Therefore, the present work focused on the acceptance of introducing wiki technology into an existing corporate knowledge management (KM) system at an international automotive supplier enterprise. Drawing on theories of technology acceptance (e.g. Venkatesh & Davis, 2000), KM research on virtual communities of practice, and motivation theory, a theoretical framework model was developed. It distinguishes individual and organizational factors and wiki features as potential influences on the successful use of wikis in corporations. The model was used as a framework for two studies investigating users' acceptance of a newly introduced wiki at the corporate level as well as of wikis at the departmental level. A third study analyzed the wiki's contribution to overall corporate KM activities. The wiki had been used to build a corporate lexicon, available to more than 120,000 employees worldwide. Methodologically, the studies incorporated surveys among wiki users and log file analyses. The main findings confirm former research results about motivation and perceived ease of use as key influences on the acceptance of new technology. However, further evidence identified users' expectancies as an additional important factor. Concerning the wiki's contribution to KM, hopes of using it as a tool to foster the generation of new knowledge were dampened, as it was used mainly for the representation of existing knowledge. Overall, the study was able to identify a number of factors for the successful introduction of wikis as collaborative authoring tools in a corporate setting.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Tue, 17 Jan 2012 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/16585/ https://edoc.ub.uni-muenchen.de/16585/1/Petri_Marisa.pdf Petri, Marisa ddc:004, ddc:000, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The touchscreen, as an alternative user interface for applications that normally require mice and keyboards, has become more and more commonplace, showing up on mobile devices, on vending machines, on ATMs and in the control panels of machines in industry, where conventional input devices cannot provide intuitive, rapid and accurate user interaction with the content of the display. The exponential growth in processing power on the PC, together with advances in understanding human communication channels, has had a significant effect on the design of usable, human-factored interfaces on touchscreens, and on the number and complexity of applications available on touchscreens. Although computer-driven touchscreen interfaces provide programmable and dynamic displays, the absence of the expected tactile cues on the hard and static surfaces of conventional touchscreens is challenging interface design and touchscreen usability, in particular for distracting, low-visibility environments. Current technology allows the human tactile modality to be used in touchscreens. While the visual channel converts graphics and text unidirectionally from the computer to the end user, tactile communication features a bidirectional information flow to and from the user as the user perceives and acts on the environment and the system responds to changing contextual information. Tactile sensations such as detents and pulses provide users with cues that make selecting and controlling a more intuitive process. Tactile features can compensate for deficiencies in some of the human senses, especially in tasks which carry a heavy visual or auditory burden. In this study, an interaction concept for tactile touchscreens is developed with a view to employing the key characteristics of the human sense of touch effectively and efficiently, especially in distracting environments where vision is impaired and hearing is overloaded. As a first step toward improving the usability of touchscreens through the integration of tactile effects, different mechanical solutions for producing motion in tactile touchscreens are investigated, to provide a basis for selecting suitable vibration directions when designing tactile displays. Building on these results, design know-how regarding tactile feedback patterns is further developed to enable dynamic simulation of UI controls, in order to give users a sense of perceiving real controls on a highly natural touch interface. To study the value of adding tactile properties to touchscreens, haptically enhanced UI controls are then further investigated with the aim of mapping haptic signals to different usage scenarios to perform primary and secondary tasks with touchscreens. The findings of the study are intended for consideration and discussion as a guide to further development of tactile stimuli, haptically enhanced user interfaces and touchscreen applications.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Physiological computing is an interesting and promising concept to widen the communication channel between the (human) users and computers, thus allowing an increase of software systems' contextual awareness and rendering software systems smarter than they are today. Using physiological inputs in pervasive computing systems allows re-balancing the information asymmetry between the human user and the computer system: while pervasive computing systems are well able to flood the user with information and sensory input (such as sounds, lights, and visual animations), users only have a very narrow input channel to computing systems; most of the time, restricted to keyboards, mice, touchscreens, accelerometers and GPS receivers (through smartphone usage, e.g.). Interestingly, this information asymmetry often forces the user to submit to the quirks of the computing system to achieve his goals -- for example, users may have to provide information the software system demands through a narrow, time-consuming input mode that the system could sense implicitly from the human body. Physiological computing is a way to circumvent these limitations; however, systematic means for developing and moulding physiological computing applications into software are still unknown. This thesis proposes a methodological approach to the creation of physiological computing applications that makes use of component-based software engineering. Components help to impose a clear structure on software systems in general, and can thus be used for physiological computing systems as well. As an additional bonus, using components allows physiological computing systems to leverage reconfigurations as a means to control and adapt their own behaviours. This adaptation can be used to adjust the behaviour both to the human and to the available computing environment in terms of resources and available devices -- an activity that is crucial for complex physiological computing systems. With the help of components and reconfigurations, it is possible to structure the functionality of physiological computing applications in a way that makes them manageable and extensible, thus allowing a stepwise and systematic extension of a system's intelligence. Using reconfigurations entails a larger issue, however. Understanding and fully capturing the behaviour of a system under reconfiguration is challenging, as the system may change its structure in ways that are difficult to fully predict. Therefore, this thesis also introduces a means for formal verification of reconfigurations based on assume-guarantee contracts. With the proposed assume-guarantee contract framework, it is possible to prove that a given system design (including component behaviours and reconfiguration specifications) satisfies real-time properties expressed as assume-guarantee contracts, using a variant of real-time linear temporal logic introduced in this thesis -- metric interval temporal logic for reconfigurable systems. Finally, this thesis embeds both the practical approach to the realisation of physiological computing systems and the formal verification of reconfigurations into Scrum, a modern and agile software development methodology. The surrounding methodological approach is intended to provide a frame for the systematic development of physiological computing systems from first psychological findings to a working software system with both satisfactory functionality and software quality aspects.
By integrating practical and theoretical aspects of software engineering into a self-contained development methodology, this thesis proposes a roadmap and guidelines for the creation of new physiological computing applications.
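As a minimal illustrative sketch of the assume-guarantee idea (a strong simplification over finite traces, not the contract framework or the real-time temporal logic developed in the thesis; component and signal names are hypothetical), the following Python code treats a contract as a pair of predicates and checks that every observed trace fulfilling the assumption also fulfils the guarantee:

    from dataclasses import dataclass
    from typing import Callable, Iterable, List

    Trace = List[dict]  # a finite trace: a list of observation dictionaries

    @dataclass
    class Contract:
        assume: Callable[[Trace], bool]     # assumption on the environment
        guarantee: Callable[[Trace], bool]  # obligation of the component

    def satisfies(traces: Iterable[Trace], contract: Contract) -> bool:
        """A component (given by its observable traces) satisfies the contract
        if every trace meeting the assumption also meets the guarantee."""
        return all(contract.guarantee(t) for t in traces if contract.assume(t))

    if __name__ == "__main__":
        # Hypothetical example: if the heart-rate sensor always delivers data,
        # the system must react within 2 steps of a detected stress event.
        def sensor_alive(t):
            return all(step.get("hr") is not None for step in t)

        def reacts_in_time(t):
            for i, step in enumerate(t):
                if step.get("stress") and not any(s.get("react") for s in t[i:i + 3]):
                    return False
            return True

        traces = [
            [{"hr": 70}, {"hr": 95, "stress": True}, {"hr": 92, "react": True}],
            [{"hr": None}, {"hr": 90, "stress": True}],  # assumption violated: ignored
        ]
        print(satisfies(traces, Contract(sensor_alive, reacts_in_time)))  # True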
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Knowledge extraction from structured data aims at identifying valid, novel, potentially useful, and ultimately understandable patterns in the data. The core step of this process is the application of a data mining algorithm in order to produce an enumeration of particular patterns and relationships in large databases. Clustering is one of the major data mining tasks and aims at grouping the data objects into meaningful classes (clusters) such that the similarity of objects within clusters is maximized, and the similarity of objects from different clusters is minimized. In this thesis, we advance the state-of-the-art data mining algorithms for analyzing structured data types. We describe the development of innovative solutions for hierarchical data mining. The EM-based hierarchical clustering method ITCH (Information-Theoretic Cluster Hierarchies) is designed to provide solid solutions for four different challenges: (1) to guide the hierarchical clustering algorithm towards only meaningful and valid clusters; (2) to represent the content of each cluster in the hierarchy by an intuitive description, e.g. with a probability density function; (3) to consistently handle outliers; and (4) to avoid difficult parameter settings. ITCH is built on a hierarchical variant of the information-theoretic principle of Minimum Description Length (MDL). Interpreting the hierarchical cluster structure as a statistical model of the dataset, it can be used for effective data compression by Huffman coding. Thus, the achievable compression rate induces a natural objective function for clustering, which automatically satisfies all four above-mentioned goals. The genetic-based hierarchical clustering algorithm GACH (Genetic Algorithm for finding Cluster Hierarchies) overcomes the problem of getting stuck in a local optimum by a beneficial combination of genetic algorithms, information theory and model-based clustering. Besides hierarchical data mining, we also made contributions to more complex data structures, namely objects that consist of mixed type attributes and skyline objects. The algorithm INTEGRATE performs integrative mining of heterogeneous data, which is one of the major challenges in the next decade, by a unified view on numerical and categorical information in clustering. Once more supported by the MDL principle, INTEGRATE guarantees the usability on real world data. For skyline objects we developed SkyDist, a similarity measure for comparing different skyline objects, which is therefore a first step towards performing data mining on this kind of data structure. Applied in a recommender system, for example, SkyDist can point the user to alternative car types that exhibit a price/mileage behavior similar to that of the original query. For mining graph-structured data, we developed different approaches that have the ability to detect patterns in static as well as in dynamic networks. We confirmed the practical feasibility of our novel approaches on large real-world case studies ranging from medical brain data to biological yeast networks. In the second part of this thesis, we focused on boosting the knowledge extraction process. We achieved this objective by an intelligent adoption of Graphics Processing Units (GPUs). The GPUs have evolved from simple devices for the display signal preparation into powerful coprocessors that do not only support typical computer graphics tasks but can also be used for general numeric and symbolic computations.
As a major advantage, GPUs provide extreme parallelism combined with a high bandwidth in memory transfer at low cost. In this thesis, we propose algorithms for computationally expensive data mining tasks like similarity search and different clustering paradigms, called CUDA-DClust and CUDA-k-means, which are designed for the highly parallel environment of a GPU. We define a multi-dimensional index structure which is particularly suited to support similarity queries under the restricted programming model of a GPU. We demonstrate the superiority of our algorithms running on GPU over their conventional counterparts on CPU in terms of efficiency.
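As a minimal illustrative sketch of the MDL idea behind ITCH (a flat, non-hierarchical simplification, not the thesis's coding scheme), the following Python code scores a Gaussian clustering by the bits needed to encode the data given the cluster models plus a crude cost for the model parameters; the clustering with the smaller total description length is preferred:

    import numpy as np
    from scipy.stats import norm

    def description_length(data, labels, resolution=1e-3):
        """Two-part MDL score: parameter cost + data cost (in bits)."""
        total = 0.0
        n, dim = data.shape
        param_bits = 0.5 * np.log2(n)  # crude cost per real-valued parameter
        for k in np.unique(labels):
            cluster = data[labels == k]
            mu, sigma = cluster.mean(axis=0), cluster.std(axis=0) + 1e-9
            total += 2 * dim * param_bits  # mean and std per dimension
            # bits for the points: -log2 of the (discretised) Gaussian density
            dens = norm.pdf(cluster, loc=mu, scale=sigma) * resolution
            total += -np.log2(np.clip(dens, 1e-300, None)).sum()
        return total

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        data = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(6, 1, (100, 2))])
        good = np.repeat([0, 1], 100)        # matches the true structure
        bad = rng.integers(0, 2, size=200)   # random assignment
        print(description_length(data, good) < description_length(data, bad))  # True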
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Variable selection and model choice are of major concern in many statistical applications, especially in regression models for high-dimensional data. Boosting is a convenient statistical method that combines model fitting with intrinsic model selection. We investigate the impact of base-learner specification on the performance of boosting as a model selection procedure. We show that variable selection may be biased if the base-learners have different degrees of flexibility, both for categorical covariates and for smooth effects of continuous covariates. We investigate these problems from a theoretical perspective and suggest a framework for unbiased model selection based on a general class of penalized least squares base-learners. Making all base-learners comparable in terms of their degrees of freedom strongly reduces the selection bias observed with naive boosting specifications. Furthermore, the definition of degrees of freedom that is used in the smoothing literature is questionable in the context of boosting, and an alternative definition is theoretically derived. The importance of unbiased model selection is demonstrated in simulations and in an application to forest health models. A second aspect of this thesis is the expansion of the boosting algorithm to new estimation problems: by using constrained base-learners, monotonicity-constrained effect estimates can be seamlessly incorporated into the existing boosting framework. This holds both for smooth effects and for ordinal variables. Furthermore, cyclic restrictions can be integrated into the model for smooth effects of continuous covariates. In particular in time-series models, cyclic constraints play an important role. Monotonic and cyclic constraints of smooth effects can, in addition, be extended to smooth, bivariate function estimates. If the true effects are monotonic or cyclic, simulation studies show that constrained estimates are superior to unconstrained estimates. In three case studies (modeling the presence of the Red Kite in Bavaria, activity profiles of Roe Deer, and deaths caused by air pollution in Sao Paulo) it is shown that both types of constraints can be integrated into the boosting framework and that they are easy to use. All described results were included in the R add-on package mboost.
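As a minimal illustrative sketch of componentwise boosting with simple base-learners (a toy in Python, not the mboost implementation or its penalized base-learners), the following code fits an L2 boosting model that in each iteration selects the single covariate whose least-squares base-learner best fits the current residuals, thereby performing variable selection as part of the fitting process:

    import numpy as np

    def l2_boost(X, y, n_iter=200, nu=0.1):
        """Componentwise L2 boosting with univariate linear base-learners."""
        n, p = X.shape
        coef = np.zeros(p)
        offset = y.mean()
        resid = y - offset
        for _ in range(n_iter):
            # fit each base-learner to the residuals, pick the best one
            betas = X.T @ resid / (X ** 2).sum(axis=0)
            sse = ((resid[:, None] - X * betas) ** 2).sum(axis=0)
            j = int(np.argmin(sse))
            coef[j] += nu * betas[j]          # weak update of the selected component
            resid = y - offset - X @ coef
        return offset, coef

    if __name__ == "__main__":
        rng = np.random.default_rng(3)
        X = rng.normal(size=(300, 10))
        y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.5, size=300)
        offset, coef = l2_boost(X, y)
        print(np.round(coef, 2))  # only components 0 and 3 should be clearly non-zero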
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Advances of modern technologies produce huge amounts of data in various fields, increasing the need for efficient and effective data mining tools to uncover the information contained implicitly in the data. This thesis mainly aims to propose innovative and solid algorithms for data mining from a novel perspective: synchronization. Synchronization is a prevalent phenomenon in nature in which a group of events spontaneously comes into co-occurrence with a common rhythm through mutual interactions. The mechanism of synchronization allows complex processes to be controlled by simple operations based on interactions between objects. The first main part of this thesis focuses on developing innovative algorithms for data mining. Inspired by the concept of synchronization, this thesis presents Sync (Clustering by Synchronization), a novel approach to clustering. In combination with the Minimum Description Length principle (MDL), it allows discovering the intrinsic clusters without any data distribution assumptions or parameter settings. In addition, relying on the different dynamic behaviors of objects during the process towards synchronization, the algorithm SOD (Synchronization-based Outlier Detection) is further proposed. Outlier objects can be naturally flagged via the definition of a Local Synchronization Factor (LSF). To cure the curse of dimensionality in clustering, a subspace clustering algorithm ORSC is introduced which automatically detects clusters in subspaces of the original feature space. This approach proposes a weighted local interaction model to ensure that all objects in a common cluster, which may reside in an arbitrarily oriented subspace, naturally move together. In order to reveal the underlying patterns in graphs, a graph partitioning approach RSGC (Robust Synchronization-based Graph Clustering) is presented. The key philosophy of RSGC is to consider graph clustering as a dynamic process towards synchronization. Inherited from the powerful concept of synchronization, RSGC shows several desirable properties that do not exist in other competitive methods. For all presented algorithms, their efficiency and effectiveness are thoroughly analyzed. The benefits over traditional approaches are further demonstrated by evaluating them on synthetic as well as real-world data sets. Beyond the theoretical research on novel data mining algorithms, the second main part of the thesis focuses on brain network analysis based on Diffusion Tensor Images (DTI). A new framework for automated white matter tract clustering is first proposed to identify meaningful fiber bundles in the human brain by combining ideas from time series mining with density-based clustering. Subsequently, enhancements and variations of this approach are discussed, allowing for a more robust, efficient, or effective way to find hierarchies of fiber bundles. Based on the structural connectivity network, an automated prediction framework is proposed to analyze and understand the abnormal patterns in patients with Alzheimer's Disease.
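As a minimal illustrative sketch of synchronization-based clustering (merely inspired by Sync, not the algorithm from the thesis and without its MDL component), the following Python code lets every point repeatedly move towards the mean of its epsilon-neighborhood; points belonging to the same cluster synchronize onto a common location, from which cluster labels are read off:

    import numpy as np

    def sync_cluster(data, eps=1.0, n_rounds=50, tol=1e-3):
        """Toy synchronization clustering: points drift to their local mean."""
        pts = data.astype(float).copy()
        for _ in range(n_rounds):
            dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
            moved = np.empty_like(pts)
            for i in range(len(pts)):
                neigh = dists[i] <= eps
                moved[i] = pts[neigh].mean(axis=0)   # interact with the neighborhood
            converged = np.abs(moved - pts).max() < tol
            pts = moved
            if converged:                            # all objects synchronized
                break
        # points that ended up (almost) at the same location form one cluster
        labels, centers = np.full(len(pts), -1), []
        for i, p in enumerate(pts):
            for c, center in enumerate(centers):
                if np.linalg.norm(p - center) < 10 * tol:
                    labels[i] = c
                    break
            else:
                centers.append(p)
                labels[i] = len(centers) - 1
        return labels

    if __name__ == "__main__":
        rng = np.random.default_rng(4)
        data = np.vstack([rng.normal(0, 0.2, (30, 2)), rng.normal(5, 0.2, (30, 2))])
        print(sync_cluster(data))  # two groups of identical labels expected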
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The ongoing automation in our modern information society leads to a tremendous rise in the amount as well as the complexity of collected data. In medical imaging, for example, the electronic availability of extensive data collected as part of clinical trials provides remarkable potential for detecting new relevant features in complex diseases like brain tumors. Using data mining applications for the analysis of the data raises several problems. One problem is the localization of outstanding observations, also called outliers, in a data set. In this work, a technique for parameter-free outlier detection is proposed that is based on data compression and on a general data model combining the Generalized Normal Distribution (GND) with independent components, in order to cope with existing problems like parameter settings or implicit data distribution assumptions. Another problem in many modern applications, among them medical imaging, is efficient similarity search in uncertain data. At present, adequate therapy planning for newly detected brain tumors presumably of glial origin requires an invasive biopsy, because both prognosis and treatment vary strongly between benign, low-grade, and high-grade tumors. To date, the differentiation of tumor grades is mainly based on the expertise of neuroradiologists examining contrast-enhanced Magnetic Resonance Images (MRI). To assist neuroradiological experts in differentiating between tumors of different malignancy, we propose a novel, efficient similarity search technique for uncertain data. The feature vector of an object is thereby not exactly known but is rather defined by a Probability Density Function (PDF) like a Gaussian Mixture Model (GMM). Previous work is limited to axis-parallel Gaussian distributions; hence, correlations between different features are not considered in these similarity searches. In this work, a novel, efficient similarity search technique for general GMMs without the independence assumption is presented. The actual components of a GMM are approximated in a conservative but tight way. The conservativity of the approach leads to a filter-refinement architecture, which guarantees no false dismissals, and the tightness of the approximations yields good filter selectivity. An extensive experimental evaluation of the approach demonstrates a considerable speed-up of similarity queries on general GMMs. Additionally, promising results for advancing the differentiation between brain tumors of different grades could be obtained by applying the approach to four-dimensional Magnetic Resonance Images of glioma patients.
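As a minimal illustrative sketch of the filter-refinement principle (with a deliberately simple, provably valid lower bound for exact vectors, not the thesis's conservative GMM approximation), the following Python code answers nearest-neighbor queries by pruning candidates with a cheap lower bound on the true distance; because the filter never overestimates the distance, no true result can be dismissed:

    import numpy as np

    def lower_bound(q, x):
        """Cheap filter: the distance in the first coordinate never exceeds
        the full Euclidean distance, so it is a valid lower bound."""
        return abs(q[0] - x[0])

    def nearest_neighbor(query, database):
        """Filter-refinement NN search: refine only candidates whose lower
        bound is below the best exact distance found so far."""
        order = sorted(range(len(database)), key=lambda i: lower_bound(query, database[i]))
        best_dist, best_i, refined = np.inf, -1, 0
        for i in order:
            if lower_bound(query, database[i]) >= best_dist:
                break                                    # all remaining candidates pruned
            d = np.linalg.norm(query - database[i])      # expensive exact distance
            refined += 1
            if d < best_dist:
                best_dist, best_i = d, i
        return best_i, best_dist, refined

    if __name__ == "__main__":
        rng = np.random.default_rng(5)
        db = rng.normal(size=(10000, 16))
        q = rng.normal(size=16)
        idx, dist, refined = nearest_neighbor(q, db)
        print(f"nearest index {idx}, distance {dist:.3f}, refined {refined} of {len(db)}")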
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
One of the main design goals of social software, such as wikis, is to support and facilitate interaction and collaboration. This dissertation explores challenges that arise from extending social software with advanced facilities such as reasoning and semantic annotations and presents tools in form of a conceptual model, structured tags, a rule language, and a set of novel forward chaining and reason maintenance methods for processing such rules that help to overcome the challenges. Wikis and semantic wikis were usually developed in an ad-hoc manner, without much thought about the underlying concepts. A conceptual model suitable for a semantic wiki that takes advanced features such as annotations and reasoning into account is proposed. Moreover, so called structured tags are proposed as a semi-formal knowledge representation step between informal and formal annotations. The focus of rule languages for the Semantic Web has been predominantly on expert users and on the interplay of rule languages and ontologies. KWRL, the KiWi Rule Language, is proposed as a rule language for a semantic wiki that is easily understandable for users as it is aware of the conceptual model of a wiki and as it is inconsistency-tolerant, and that can be efficiently evaluated as it builds upon Datalog concepts. The requirement for fast response times of interactive software translates in our work to bottom-up evaluation (materialization) of rules (views) ahead of time – that is when rules or data change, not when they are queried. Materialized views have to be updated when data or rules change. While incremental view maintenance was intensively studied in the past and literature on the subject is abundant, the existing methods have surprisingly many disadvantages – they do not provide all information desirable for explanation of derived information, they require evaluation of possibly substantially larger Datalog programs with negation, they recompute the whole extension of a predicate even if only a small part of it is affected by a change, they require adaptation for handling general rule changes. A particular contribution of this dissertation consists in a set of forward chaining and reason maintenance methods with a simple declarative description that are efficient and derive and maintain information necessary for reason maintenance and explanation. The reasoning methods and most of the reason maintenance methods are described in terms of a set of extended immediate consequence operators the properties of which are proven in the classical logical programming framework. In contrast to existing methods, the reason maintenance methods in this dissertation work by evaluating the original Datalog program – they do not introduce negation if it is not present in the input program – and only the affected part of a predicate’s extension is recomputed. Moreover, our methods directly handle changes in both data and rules; a rule change does not need to be handled as a special case. A framework of support graphs, a data structure inspired by justification graphs of classical reason maintenance, is proposed. Support graphs enable a unified description and a formal comparison of the various reasoning and reason maintenance methods and define a notion of a derivation such that the number of derivations of an atom is always finite even in the recursive Datalog case. A practical approach to implementing reasoning, reason maintenance, and explanation in the KiWi semantic platform is also investigated. 
It is shown how an implementation may benefit from using a graph database instead of or along with a relational database.
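As a minimal illustrative sketch of bottom-up rule materialization (generic semi-naive Datalog evaluation, not the KiWi reason maintenance methods themselves), the following Python code evaluates a recursive reachability program by applying rules only to facts derived in the previous round until a fixpoint is reached:

    def materialize(edges):
        """Semi-naive evaluation of: reach(X,Y) :- edge(X,Y).
                                      reach(X,Z) :- reach(X,Y), edge(Y,Z)."""
        reach = set(edges)          # first rule: every edge is a reachability fact
        delta = set(edges)          # facts that are new in the current round
        while delta:
            new = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
            delta = new - reach     # only genuinely new facts trigger the next round
            reach |= delta
        return reach

    if __name__ == "__main__":
        edges = {("a", "b"), ("b", "c"), ("c", "d")}
        for fact in sorted(materialize(edges)):
            print("reach%s" % (fact,))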
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
A primary feature of a computer program is its quantitative performance characteristics: the amount of resources such as time, memory, and power the program needs to perform its task. Concrete resource bounds for specific hardware have many important applications in software development but their manual determination is tedious and error-prone. This dissertation studies the problem of automatically determining concrete worst-case bounds on the quantitative resource consumption of functional programs. Traditionally, automatic resource analyses are based on recurrence relations. The difficulty of both extracting and solving recurrence relations has led to the development of type-based resource analyses that are compositional, modular, and formally verifiable. However, existing automatic analyses based on amortization or sized types can only compute bounds that are linear in the sizes of the arguments of a function. This work presents a novel type system that derives polynomial bounds from first-order functional programs. As pioneered by Hofmann and Jost for linear bounds, it relies on the potential method of amortized analysis. Types are annotated with multivariate resource polynomials, a rich class of functions that generalize non-negative linear combinations of binomial coefficients. The main theorem states that type derivations establish resource bounds that are sound with respect to the resource consumption of programs, which is formalized by a big-step operational semantics. Simple local type rules allow for an efficient inference algorithm for the type annotations which relies on linear constraint solving only. This gives rise to an analysis system that is fully automatic if a maximal degree of the bounding polynomials is given. The analysis is generic in the resource of interest and can derive bounds on time and space usage. The bounds are naturally closed under composition and eventually summarized in closed, easily understood formulas. The practicability of this automatic amortized analysis is verified with a publicly available implementation and a reproducible experimental evaluation. The experiments with a wide range of examples from functional programming show that the inference of the bounds only takes a couple of seconds in most cases. The derived heap-space and evaluation-step bounds are compared with the measured worst-case behavior of the programs. Most bounds are asymptotically tight, and the constant factors are close or even identical to the optimal ones. For the first time we are able to automatically and precisely analyze the resource consumption of involved programs such as quick sort for lists of lists, longest common subsequence via dynamic programming, and multiplication of a list of matrices with different, fitting dimensions.
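As a minimal illustrative sketch of the kind of bound such an analysis certifies (a hand-chosen cost model and candidate polynomial, not output of the actual system), the following Python code counts the pair constructions performed by an all-pairs function on a list and compares the measured cost with a bound built from a binomial coefficient, here simply C(n, 2):

    from math import comb

    def all_pairs(xs, counter):
        """Return all unordered pairs of xs, counting each constructed pair."""
        pairs = []
        for i in range(len(xs)):
            for j in range(i + 1, len(xs)):
                pairs.append((xs[i], xs[j]))
                counter[0] += 1          # one 'cons' / pair construction
        return pairs

    if __name__ == "__main__":
        for n in (0, 1, 5, 10, 50):
            counter = [0]
            all_pairs(list(range(n)), counter)
            bound = comb(n, 2)           # candidate bound: binomial coefficient C(n, 2)
            print(f"n={n:3d}  measured={counter[0]:5d}  bound={bound:5d}  "
                  f"tight={counter[0] == bound}")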
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The A-Translation enables us to unravel the computational information in classical proofs by first transforming them into constructive ones, however at the cost of introducing redundancies in the extracted code. This is due to the fact that all negations inserted during translation are replaced by the computationally relevant form of the goal. In this thesis we are concerned with eliminating such redundancies in order to obtain better extracted programs. For this, we propose two methods: a controlled and minimal insertion of negations, such that a refinement of the A-Translation can be used, and an algorithmic decoration of the proofs, in order to mark the computationally irrelevant components. By restricting the logic to be minimal, the Double Negation Translation is no longer necessary. On this fragment of minimal logic we apply the refined A-Translation, as proposed in (Berger et al., 2002). This method identifies further selected classes of formulas for which the negations do not need to be substituted by computationally relevant formulas. However, the refinement imposes restrictions which considerably narrow the applicability domain of the A-Translation. We address this issue by proposing a controlled insertion of double negations, with the benefit that some intuitionistically valid Pi^0_2-formulas become provable in minimal logic and that certain formulas are transformed to match the requirements of the refined A-Translation. We present the outcome of applying the refined A-Translation to a series of examples. Their purpose is twofold. On the one hand, they serve as case studies for the role played by negations, by shedding light on the restrictions imposed by the translation method. On the other hand, the extracted programs are characterized by a specific behaviour: they adhere to the continuation passing style and the recursion is in general in tail form. The second improvement concerns the detection of the computationally irrelevant subformulas, such that no terms are extracted from them. In order to achieve this, we assign decorations to the implication and the universal quantifier. The algorithm that we propose is shown to be optimal, correct and terminating, and is applied to the examples of factorial and list reversal.
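As a minimal illustrative sketch of the shape such extracted programs take (hand-written Python, not actual extracted code), the following snippet shows list reversal once in continuation passing style and once with an accumulator in tail form, the two behaviours mentioned above:

    def reverse_cps(xs, k=lambda acc: acc):
        """List reversal in continuation passing style."""
        if not xs:
            return k([])
        head, tail = xs[0], xs[1:]
        return reverse_cps(tail, lambda rev: k(rev + [head]))

    def reverse_tail(xs, acc=None):
        """List reversal with the recursion in tail form."""
        acc = [] if acc is None else acc
        if not xs:
            return acc
        return reverse_tail(xs[1:], [xs[0]] + acc)

    if __name__ == "__main__":
        print(reverse_cps([1, 2, 3, 4]))   # [4, 3, 2, 1]
        print(reverse_tail([1, 2, 3, 4]))  # [4, 3, 2, 1]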
Fakultät für Sprach- und Literaturwissenschaften - Digitale Hochschulschriften der LMU
Translation memory systems (TM systems) are software packages used in computer-assisted translation (CAT) to support human translators. As an example of successful natural language processing (NLP), these applications have been discussed in monographic works, conferences, articles in specialized journals, newsletters, forums, mailing lists, etc. This thesis focuses on how TM systems deal with placeable and localizable elements, as defined in 2.1.1.1. Although these elements are mentioned in the cited sources, there is no systematic work discussing them. This thesis aims to fill this gap and to suggest improvements that could be implemented in order to tackle current shortcomings. The thesis is divided into the following chapters. Chapter 1 is a general introduction to the field of TM technology. Chapter 2 presents the conducted research in detail. Chapters 3 to 12 each discuss a specific category of placeable and localizable elements. Finally, chapter 13 provides a conclusion summarizing the major findings of this research project.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
The last years have seen a tremendous increase of data acquisition in different scientific fields such as molecular biology, bioinformatics or biomedicine. Therefore, novel methods are needed for automatic data processing and analysis of this large amount of data. Data mining is the process of applying methods like clustering or classification to large databases in order to uncover hidden patterns. Clustering is the task of partitioning points of a data set into distinct groups in order to maximize the intra-cluster similarity and to minimize the inter-cluster similarity. In contrast to unsupervised learning tasks like clustering, classification is a supervised learning task that aims at the prediction of group membership of data objects on the basis of rules learned from a training set where the group membership is known. Specialized methods have been proposed for hierarchical and partitioning clustering. However, these methods suffer from several drawbacks. In the first part of this work, new clustering methods are proposed that cope with problems of conventional clustering algorithms. ITCH (Information-Theoretic Cluster Hierarchies) is a hierarchical clustering method that is based on a hierarchical variant of the Minimum Description Length (MDL) principle and finds hierarchies of clusters without requiring input parameters. As ITCH may converge only to a local optimum, we propose GACH (Genetic Algorithm for Finding Cluster Hierarchies), which combines the benefits of genetic algorithms with information theory. In this way the search space is explored more effectively. Furthermore, we propose INTEGRATE, a novel clustering method for data with mixed numerical and categorical attributes. Supported by the MDL principle, our method integrates the information provided by heterogeneous numerical and categorical attributes and thus naturally balances the influence of both sources of information. A competitive evaluation illustrates that INTEGRATE is more effective than existing clustering methods for mixed type data. Besides clustering methods for single data objects, we provide a solution for clustering different data sets that are represented by their skylines. The skyline operator is a well-established database primitive for finding database objects which minimize two or more attributes with an unknown weighting between these attributes. In this thesis, we define a similarity measure, called SkyDist, for comparing skylines of different data sets that can directly be integrated into different data mining tasks such as clustering or classification. The experiments show that SkyDist in combination with different clustering algorithms can give useful insights into many applications. In the second part, we focus on the analysis of high resolution magnetic resonance images (MRI) that are clinically relevant and may allow for an early detection and diagnosis of several diseases. In particular, we propose a framework for the classification of Alzheimer's disease in MR images combining the data mining steps of feature selection, clustering and classification. As a result, a set of highly selective features discriminating patients with Alzheimer's disease from healthy people has been identified. However, the analysis of the high dimensional MR images is extremely time-consuming. Therefore we developed JGrid, a scalable distributed computing solution designed to allow for a large scale analysis of MRI and thus an optimized prediction of diagnosis.
In another study we apply efficient algorithms for motif discovery to task-fMRI scans in order to identify patterns in the brain that are characteristic of patients with somatoform pain disorder. We find groups of brain compartments that occur frequently within the brain networks and discriminate well between healthy and diseased people.
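As a minimal illustrative sketch of the skyline operator on which SkyDist builds (not the SkyDist measure itself; the car data is hypothetical), the following Python code computes the skyline of a set of cars described by price and mileage, i.e. all cars not dominated by another car that is at least as good in every attribute and strictly better in one:

    def dominates(a, b):
        """a dominates b if a is <= in every attribute and < in at least one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def skyline(points):
        """Return all points not dominated by any other point (minimization)."""
        return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

    if __name__ == "__main__":
        # hypothetical cars: (price in kEUR, mileage in 1000 km)
        cars = [(20, 60), (15, 90), (30, 20), (25, 70), (18, 65)]
        print(skyline(cars))
        # -> [(20, 60), (15, 90), (30, 20), (18, 65)]; (25, 70) is dominated by (20, 60)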
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Tue, 19 Jul 2011 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/13277/ https://edoc.ub.uni-muenchen.de/13277/1/Spiessl_Wolfgang.pdf Spießl, Wolfgang ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Parity games form an intriguing family of infinitary payoff games whose solution is equivalent to the solution of important problems in automatic verification and automata theory. They also form a very natural subclass of mean and discounted payoff games, which in turn are very natural subclasses of turn-based stochastic payoff games. From a theoretical point of view, solving these games is one of the few problems that belong to the complexity class NP intersect coNP, and even more interestingly, solving has been shown to belong to UP intersect coUP, and also to PLS. It is a major open problem whether these game families can be solved in deterministic polynomial time. Policy iteration is one of the most important algorithmic schemes for solving infinitary payoff games. It is parameterized by an improvement rule that determines how to proceed in the iteration from one policy to the next. It is a major open problem whether there is an improvement rule that results in a polynomial time algorithm for solving one of the considered game classes. Linear programming is one of the most important computational problems studied by researchers in computer science, mathematics and operations research. Perhaps more articles and books are written about linear programming than about all other computational problems combined. The simplex and the dual-simplex algorithms are among the most widely used algorithms for solving linear programs in practice. Simplex algorithms for solving linear programs are closely related to policy iteration algorithms. Like policy iteration, the simplex algorithm is parameterized by a pivoting rule that describes how to proceed from one basic feasible solution in the linear program to the next. It is a major open problem whether there is a pivoting rule that results in a (strongly) polynomial time algorithm for solving linear programs. We contribute to both the policy iteration and the simplex algorithm by proving exponential lower bounds for several improvement and pivoting rules, respectively. For every considered improvement rule, we start by building 2-player parity games on which the respective policy iteration algorithm performs an exponential number of iterations. We then transform these 2-player games into 1-player Markov decision processes which correspond almost immediately to concrete linear programs on which the respective simplex algorithm requires the same number of iterations. Additionally, we show how to transfer the lower bound results to more expressive game classes like payoff and turn-based stochastic games. Particularly, we prove exponential lower bounds for the deterministic switch all and switch best improvement rules for solving games, for which no non-trivial lower bounds had been known since the introduction of Howard's policy iteration algorithm in 1960. Moreover, we prove exponential lower bounds for the two most natural and most studied randomized pivoting rules suggested to date, namely the random facet and random edge rules for solving games and linear programs, for which no non-trivial lower bounds had been known for several decades. Furthermore, we prove an exponential lower bound for the switch half randomized improvement rule for solving games, which is considered to be the most important multi-switching randomized rule. Finally, we prove an exponential lower bound for the most natural and famous history-based pivoting rule due to Zadeh for solving games and linear programs, which has been an open problem for thirty years.
Last but not least, we prove exponential lower bounds for two other classes of algorithms that solve parity games, namely for the model checking algorithm due to Stevens and Stirling and for the recursive algorithm by Zielonka.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Despite advances in software and hardware technologies, faults are still inevitable in a highly dependent, human-engineered and administered IT environment. Given the critical role of IT services today, it is imperative that faults, once they have occurred, be dealt with efficiently and effectively to avoid or reduce the actual losses. Nevertheless, the complexities of current IT services, e.g. with regard to their scale, heterogeneity and highly dynamic infrastructures, make the recovery operation a challenging task for operators. Such complexities will eventually outgrow the human capability to manage them. This difficulty is augmented by the fact that there are few well-devised methods available to support fault recovery. To tackle this issue, this thesis aims at providing a computer-aided approach to assist operators with fault recovery planning and, consequently, to increase the efficiency of recovery activities. We propose a generic framework based on automated planning theory to generate plans for the recovery of IT services. At the heart of the framework is a planning component. Assisted by the other participants in the framework, the planning component aggregates the relevant information and computes recovery steps accordingly. The main idea behind the planning component is to sustain the planning operations with automated planning techniques, one of the research fields of artificial intelligence. Provided with a general planning model, we show theoretically that the service fault recovery problem can indeed be solved by automated planning techniques. The relationship between a planning problem and a fault recovery problem is shown by means of a reduction between these problems. After an extensive investigation, we choose a planning paradigm based on Hierarchical Task Networks (HTN) as the guideline for the design of our main planning algorithm, called H2MAP. To sustain the operation of the planner, a set of components revolving around the planning component is provided. These components are responsible for tasks such as translation between different knowledge formats, persistent storage of planning knowledge and communication with external systems. To ensure extensibility in our design, we apply different design patterns to the components. We sketch and discuss the technical aspects of implementations of the core components. Finally, as proof of concept, the framework is instantiated in two distinct application scenarios.
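As a minimal illustrative sketch of HTN-style decomposition (a toy model with hypothetical task and method names, not the H2MAP algorithm), the following Python code recursively decomposes an abstract recovery task into primitive actions using a small method library, which is the basic mechanism a hierarchical task network planner builds on:

    # Toy HTN decomposition for a hypothetical IT fault recovery scenario.
    # Compound tasks are decomposed via methods; primitive tasks become plan steps.

    METHODS = {
        "recover_web_service": [
            ["diagnose", "restart_service", "verify"],      # preferred method
            ["diagnose", "failover_to_backup", "verify"],   # fallback method
        ],
    }

    PRIMITIVES = {"diagnose", "restart_service", "failover_to_backup", "verify"}

    def applicable(action, state):
        # Hypothetical precondition: restarting requires a working control channel.
        if action == "restart_service":
            return state.get("control_channel_up", False)
        return True

    def plan(task, state):
        """Return a list of primitive actions achieving `task`, or None."""
        if task in PRIMITIVES:
            return [task] if applicable(task, state) else None
        for method in METHODS.get(task, []):
            steps = []
            for subtask in method:
                sub = plan(subtask, state)
                if sub is None:
                    break                    # this method fails, try the next one
                steps.extend(sub)
            else:
                return steps
        return None

    if __name__ == "__main__":
        print(plan("recover_web_service", {"control_channel_up": False}))
        # -> ['diagnose', 'failover_to_backup', 'verify'] (restart method not applicable)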
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Usable and secure authentication is a research field that approaches different challenges related to authentication, including security, from a human-computer interaction perspective. That is, work in this field tries to overcome security, memorability and performance problems that are related to the interaction with an authentication mechanism. More and more services that require authentication, like ticket vending machines or automated teller machines (ATMs), take place in a public setting, in which security threats are more inherent than in other settings. In this work, we approach the problem of usable and secure authentication for public spaces. The key result of the work reported here is a set of well-founded criteria for the systematic evaluation of authentication mechanisms. These criteria are justified by two different types of investigation, which are on the one hand prototypical examples of authentication mechanisms with improved usability and security, and on the other hand empirical studies of security-related behavior in public spaces. So this work can be structured in three steps: Firstly, we present five authentication mechanisms that were designed to overcome the main weaknesses of related work, which we identified using a newly created categorization of authentication mechanisms for public spaces. The systems were evaluated in detail and showed encouraging results for future use. This, together with the negative sides and problems that we encountered with these systems, helped us to gain diverse insights into the design and evaluation process of such systems in general. It showed that the development process of authentication mechanisms for public spaces needs to be improved to create better results. Along with this, it provided insights into why different pieces of related work are difficult to compare with each other. Keeping this in mind, first criteria were identified that can fill these gaps and improve the design and evaluation of authentication mechanisms, with a focus on the public setting. Furthermore, a series of studies was performed to gain insights into factors influencing the quality of authentication mechanisms and to define a catalog of criteria that can be used to support creating such systems. It includes a long-term study of different PIN-entry systems as well as two field studies and field interviews on real-world ATM use. With this, we could refine the previous criteria and define additional criteria, many of them related to human factors. For instance, we showed that social issues, like trust, can highly affect the security of an authentication mechanism. We used these results to define a catalog of seven criteria. Besides their definition, we provide information on how applying them influences design, implementation and evaluation within the development process and, more specifically, how adherence improves authentication in general. A comparison of two authentication mechanisms for public spaces shows that a system that fulfills the criteria outperforms a system with less compliance. We could also show that compliance not only improves the authentication mechanisms themselves, it also allows for detailed comparisons between different systems.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Wed, 18 May 2011 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/13106/ https://edoc.ub.uni-muenchen.de/13106/1/Marcu_Patricia.pdf Marcu, Gabriela-Patricia ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Thu, 17 Mar 2011 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/13028/ https://edoc.ub.uni-muenchen.de/13028/1/Scheipl_Fabian.pdf Scheipl, Fabian ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Enabling non-experts to publish data on the web is an important achievement of the social web and one of the primary goals of the social semantic web. Making the data easily accessible in turn has received little attention, which is problematic from the point of view of incentives: users are likely to be less motivated to participate in the creation of content if the use of this content is mostly reserved for experts. Querying in semantic wikis, for example, is typically realized in terms of full text search over the textual content and a web query language such as SPARQL for the annotations. This approach has two shortcomings that limit the extent to which data can be leveraged by users: combined queries over content and annotations are not possible, and users either are restricted to expressing their query intent using simple but vague keyword queries or have to learn a complex web query language. The work presented in this dissertation investigates a more suitable form of querying for semantic wikis that consolidates two seemingly conflicting characteristics of query languages, ease of use and expressiveness. This work was carried out in the context of the semantic wiki KiWi, but the underlying ideas apply more generally to the social semantic and social web. We begin by defining a simple modular conceptual model for the KiWi wiki that enables rich and expressive knowledge representation. One component of this model is structured tags, an annotation formalism that is simple yet flexible and expressive, and aims at bridging the gap between atomic tags and RDF. The viability of the approach is confirmed by a user study, which finds that structured tags are suitable for quickly annotating evolving knowledge and are perceived well by the users. The main contribution of this dissertation is the design and implementation of KWQL, a query language for semantic wikis. KWQL combines keyword search and web querying to enable querying that scales with user experience and information need: basic queries are easy to express; as the search criteria become more complex, more expertise is needed to formulate the corresponding query. A novel aspect of KWQL is that it combines both paradigms in a bottom-up fashion. It treats neither of the two as an extension to the other, but instead integrates both in one framework. The language allows for rich combined queries of full text, metadata, document structure, and informal to formal semantic annotations. KWilt, the KWQL query engine, provides the full expressive power of first-order queries, but at the same time can evaluate basic queries at almost the speed of the underlying search engine. KWQL is accompanied by the visual query language visKWQL, and an editor that displays both the textual and visual form of the current query and reflects changes to either representation in the other. A user study shows that participants quickly learn to construct KWQL and visKWQL queries, even when given only a short introduction. KWQL allows users to sift the wealth of structure and annotations in an information system for relevant data. If relevant data constitutes a substantial fraction of all data, ranking becomes important. To this end, we propose PEST, a novel ranking method that propagates relevance among structurally related or similarly annotated data. Extensive experiments, including a user study on a real-life wiki, show that PEST improves the quality of the ranking over a range of existing ranking approaches.
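As a minimal illustrative sketch of propagating relevance along structural links (in the spirit of PEST, but not its actual algorithm; the link graph and scores are hypothetical), the following Python code spreads initial keyword-match scores over a link graph with a damped power iteration, so that items connected to highly relevant items also gain relevance:

    import numpy as np

    def propagate_relevance(adjacency, initial_scores, alpha=0.5, n_iter=50):
        """Spread relevance over a graph: each node keeps a share of its own
        initial score and receives a share from its neighbours."""
        A = np.asarray(adjacency, dtype=float)
        # column-normalise so each node distributes its relevance to its neighbours
        col_sums = A.sum(axis=0)
        A = np.divide(A, col_sums, out=np.zeros_like(A), where=col_sums > 0)
        r = np.asarray(initial_scores, dtype=float)
        scores = r.copy()
        for _ in range(n_iter):
            scores = (1 - alpha) * r + alpha * A @ scores
        return scores

    if __name__ == "__main__":
        # hypothetical wiki pages 0..3; page 3 has no keyword match but links to 0 and 1
        links = [[0, 1, 0, 1],
                 [1, 0, 0, 1],
                 [0, 0, 0, 0],
                 [1, 1, 0, 0]]
        keyword_match = [1.0, 0.8, 0.0, 0.0]
        print(np.round(propagate_relevance(links, keyword_match), 3))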
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
In this day and age, the mobile phone is becoming one of the most indispensable personal computing devices. People no longer use it just for communication (i.e. calling and sending messages) but for many other aspects of their lives as well. Because of this rise in demand for different and innovative applications, mobile companies (i.e. mobile handset manufacturers and mobile network providers) and organizations have realized the power of collaborative software development and have changed their business strategy. Instead of hiring specific organizations to do the programming, they are now opening up their APIs and tools to allow ordinary people to create their own mobile applications, either for personal use or for profit. However, the problem with this approach is that there are people who have good ideas of their own but do not possess the technical expertise to create applications implementing these ideas. The goal of this research is to find ways to simplify the creation of mobile applications for non-technical people by applying model-driven software development, in particular domain-specific modeling, combined with techniques from the field of human-computer interaction (HCI), in particular iterative, user-centered system design. As proof of concept, we concentrate on the development of applications in the domain of mHealth and use the Android Framework as the target platform for code generation. The iterative, user-centered design and development of the front-end tool, the Mobia Modeler, eventually led us to create a tool that features a configurable-component-based design and an integrated modeless environment to simplify the different development tasks of end-users. The Mobia models feature both constructs specialized for specific domains (e.g. sensor component, special component) and constructs that are applicable to any type of domain (e.g. structure component, basic component). In order to accommodate the different needs of end-users, a clear separation between the front-end tools (i.e. the Mobia Modeler) and the underlying code generator (i.e. the Mobia Processor) is recommended, as long as there is a consistent model in between that serves as a bridge between the different tools.
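As a rough illustration of the model-driven idea (a declarative model from which application code is generated), the following Python sketch turns a toy component model into a code skeleton. The model format, the component names and the generated output are invented for illustration and are far simpler than the actual Mobia Modeler and Mobia Processor.

```python
# Illustrative sketch only: a toy model-to-code generator in the spirit of
# domain-specific modeling. The model and the emitted skeleton are hypothetical.
app_model = {
    "name": "BloodPressureDiary",
    "components": [
        {"type": "sensor", "id": "bp_sensor", "measures": "blood_pressure"},
        {"type": "structure", "id": "main_screen", "children": ["bp_sensor"]},
    ],
}

def generate_skeleton(model):
    """Emit a (pseudo-)Android activity skeleton from the declarative model."""
    lines = [f"// Generated from model '{model['name']}'",
             f"public class {model['name']}Activity extends Activity {{"]
    for comp in model["components"]:
        lines.append(f"    // component '{comp['id']}' of type '{comp['type']}'")
    lines.append("}")
    return "\n".join(lines)

print(generate_skeleton(app_model))
```

The separation the abstract recommends corresponds to keeping app_model (the consistent model) independent of generate_skeleton (the code generator), so either side can be replaced without touching the other.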
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, useful and ultimately understandable patterns in data. The core step of the KDD process is the application of Data Mining (DM) algorithms to efficiently find interesting patterns in large databases. This thesis concerns itself with three inter-related themes: generalised interaction and rule mining; the incorporation of statistics into novel data mining approaches; and probabilistic frequent pattern mining in uncertain databases. An interaction describes an effect that variables have -- or appear to have -- on each other. Interaction mining is the process of mining structures on variables describing their interaction patterns -- usually represented as sets, graphs or rules. Interactions may be complex, represent both positive and negative relationships, and the presence of one interaction can influence another interaction or variable in interesting ways. Finding interactions is useful in domains ranging from social network analysis, marketing, the sciences and e-commerce to statistics and finance. Many data mining tasks may be considered as mining interactions, such as clustering, frequent itemset mining, association rule mining, classification rules, graph mining and flock mining. Interaction mining problems can have very different semantics, pattern definitions, interestingness measures and data types. Solving a wide range of interaction mining problems at the abstract level, and doing so efficiently -- ideally more efficiently than with specialised approaches -- is a challenging problem. This thesis introduces and solves the Generalised Interaction Mining (GIM) and Generalised Rule Mining (GRM) problems. GIM and GRM use an efficient and intuitive computational model based purely on vector-valued functions. The semantics of the interactions, their interestingness measures and the type of data considered are flexible components of the vectorised frameworks. By separating the semantics of a problem from the algorithm used to mine it, the frameworks allow both to vary independently of each other. This makes it easier to develop new methods by focusing purely on a problem's semantics and removing the burden of designing an efficient algorithm. By encoding interactions as vectors in the space (or a sub-space) of samples, they provide an intuitive geometric interpretation that inspires novel methods. By operating in time linear in the number of interesting interactions that need to be examined, the GIM and GRM algorithms are optimal. The use of GRM or GIM provides efficient solutions to a range of problems in this thesis, including graph mining, counting-based methods, itemset mining, clique mining, a clustering problem, complex pattern mining, negative pattern mining, solving an optimisation problem, spatial data mining, probabilistic itemset mining, probabilistic association rule mining, feature selection and generation, classification and multiplication rule mining. Data mining is a hypothesis-generating endeavour, examining large databases for patterns suggesting novel and useful knowledge to the user. Since the database is a sample, the patterns found should describe hypotheses about the underlying process generating the data. In searching for these patterns, a DM algorithm makes additional hypotheses when it prunes the search space. Natural questions to ask, then, are: "Does the algorithm find patterns that are statistically significant?" and "Did the algorithm make significant decisions during its search?". Such questions address the quality of patterns found through data mining and the confidence that a user can have in utilising them. Finally, statistics has a range of useful tools and measures that are applicable in data mining. In this context, this thesis incorporates statistical techniques -- in particular, non-parametric significance tests and correlation -- directly into novel data mining approaches. This idea is applied to statistically significant and relatively class-correlated rule-based classification of imbalanced data sets; significant frequent itemset mining; mining complex correlation structures between variables for feature selection; mining correlated multiplication rules for interaction mining and feature generation; and conjunctive correlation rules for classification. The application of GIM or GRM to these problems leads to efficient and intuitive solutions. Frequent itemset mining (FIM) is a fundamental problem in data mining. While it is usually assumed that the items occurring in a transaction are known for certain, in many applications the data is inherently noisy or probabilistic: examples include noise added in privacy-preserving data mining applications, aggregation or grouping of records leading to estimated purchase probabilities, and databases capturing naturally uncertain phenomena. The consideration of existential uncertainty of item(sets) makes traditional techniques inapplicable. Prior to the work in this thesis, itemsets were mined if their expected support was high. This returns only an estimate, ignores the probability distribution of support, provides no confidence in the results, and can lead to scenarios where itemsets are labeled frequent even if they are more likely to be infrequent. Clearly, this is undesirable. This thesis proposes and solves the Probabilistic Frequent Itemset Mining (PFIM) problem, where itemsets are considered interesting if the probability that they are frequent is high. The problem is solved under the possible worlds model and a proposed probabilistic framework for PFIM. Novel and efficient methods are developed for computing an itemset's exact support probability distribution and frequentness probability, using the Poisson binomial recurrence, generating functions, or a Normal approximation. Incremental methods are proposed to answer queries such as finding the top-k probabilistic frequent itemsets. A number of specialised PFIM algorithms are developed, with each being more efficient than the last: ProApriori is the first solution to PFIM and is based on candidate generation and testing. ProFP-Growth is the first probabilistic FP-Growth-type algorithm and uses a proposed probabilistic frequent pattern tree (Pro-FPTree) to avoid candidate generation. Finally, the application of GIM leads to GIM-PFIM, the fastest known algorithm for solving the PFIM problem. It achieves orders of magnitude improvements in space and time usage, and leads to an intuitive subspace- and probability-vector-based interpretation of PFIM.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Wed, 8 Dec 2010 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/12764/ https://edoc.ub.uni-muenchen.de/12764/2/Kohlmann_Mareike.pdf Kohlmann, Mareike ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
UML state machines are a widely used language for modeling software behavior. They are considered simple and intuitively comprehensible, and are hence one of the most popular languages for modeling reactive components. However, this seeming ease of use vanishes rapidly as soon as the complexity of the system to be modeled increases. In fact, even state machines modeling "almost trivial" behavior may become rather hard to understand and error-prone. In particular, synchronization of parallel regions and history-based features are often difficult to model in UML state machines. We therefore propose High-Level Aspect (HiLA), a new, aspect-oriented extension of UML state machines, which can considerably improve the modularity, and thus the comprehensibility and reusability, of UML state machines. Aspects are used to define additional or alternative system behaviors at certain "interesting" points in time in the execution of the state machine, and achieve a high degree of separation of concerns. The distinguishing feature of HiLA with respect to other approaches to aspect-oriented state machines is that HiLA aspects are defined on a high, i.e. semantic, level as opposed to a low, i.e. syntactic, level. This semantic approach often makes HiLA aspects simpler and more comprehensible than aspects of syntactic approaches. The contributions of this thesis include 1) the abstract and concrete syntax of HiLA, 2) the weaving algorithms showing how the (additional or alternative) behaviors, separately modeled in aspects, are composed with the base state machine, giving the complete behavior of the system, and 3) a formal semantics for HiLA aspects defining how the aspects are activated and, after their execution, left. We also discuss which conflicts between HiLA aspects are possible and how to detect them. The practical applicability of HiLA is shown in a case study of a crisis management system.
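The following Python sketch is only a toy illustration of the general idea of weaving additional behavior into a state machine at semantically chosen points (here: "whenever a given state is entered"); it does not reflect HiLA's concrete syntax, its weaving algorithms, or UML semantics.

```python
# Toy illustration of attaching aspect-like advice to semantically chosen points
# of a state machine's execution. All names and structures are hypothetical.

base_transitions = {
    ("Idle", "start"): "Running",
    ("Running", "pause"): "Paused",
    ("Paused", "resume"): "Running",
    ("Running", "stop"): "Idle",
}

# Aspect: additional behaviour whenever the machine enters "Running".
aspects = {"Running": [lambda: print("aspect: entering Running")]}

def step(state, event):
    """Fire a transition and run any advice attached to the target state."""
    target = base_transitions.get((state, event))
    if target is None:
        return state  # no transition defined for this event
    for advice in aspects.get(target, []):
        advice()
    return target

state = "Idle"
for event in ["start", "pause", "resume", "stop"]:
    state = step(state, event)
```

The base machine stays untouched while the extra behavior lives in a separate structure, which is the separation-of-concerns effect the abstract describes.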
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Tue, 23 Nov 2010 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/12440/ https://edoc.ub.uni-muenchen.de/12440/1/Philip_Mayer.pdf Mayer, Philip ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Performance problems frequently occur in practice in IT applications, despite increasing hardware performance and a wide variety of approaches to developing performant software across the software life cycle. Model-based performance analyses make it possible to prevent performance problems on the basis of design artefacts. For existing or partially implemented IT applications, one instead tries to resolve performance problems by scaling the hardware or optimising the code. Both approaches have drawbacks: model-based approaches are not generally used because of the high level of expertise they require, and after-the-fact optimisation is an unsystematic and uncoordinated process. This dissertation proposes a new approach to performance analysis for subsequent optimisation. Performance interdependencies in an IT application are identified by means of an experiment. The basis of the experiment, the analysis instrumentation, is a targeted temporal variation of the start time, end time or duration of processes in the IT application. This approach can be automated and can be applied in the software development process in a structured way without a steep learning curve. Using the Turing machine, it is proven that the temporal variation of the analysis instrumentation preserves the correctness of sequential computations. This result is extended to concurrent systems by means of the parallel register machine and discussed. With this practice-oriented machine model it is shown that the causal relationships discovered by the analysis instrumentation identify optimisation candidates. A dedicated experimentation environment, in which the processes of a system consisting of software and hardware can be varied programmatically, is realised by means of a virtualisation solution. Techniques for applying the analysis instrumentation through code instrumentation are described. A method for determining the minimum hardware requirements of IT applications is presented and exemplified with the experimentation environment using two scenarios and the Android operating system. Various procedures for deriving the system's optimisation candidates from the observations of the experiment are presented, classified and evaluated. The identification of optimisation candidates and optimisation potential is demonstrated in practice on illustration scenarios and several large IT applications using these methods. As a consequent extension, a test method based on the analysis instrumentation is given for validating a system against non-deterministically reproducible faults that arise from missing synchronisation mechanisms (e.g. races) or from timing effects (e.g. Heisenbugs, ageing-related faults).
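As a hedged illustration of the core idea of the analysis instrumentation, namely the targeted temporal variation of activities, the following Python sketch injects a controlled delay into one step of a toy workload and compares end-to-end times; the workload and all names are invented for illustration and do not come from the dissertation.

```python
# Hedged sketch: inject a controlled delay into one activity and observe how the
# end-to-end behaviour changes. The toy workload is purely illustrative.
import time

def workload(delay_in_step_a=0.0):
    """Two sequential steps; an injected delay in step A shifts everything after it."""
    t0 = time.perf_counter()
    time.sleep(0.05 + delay_in_step_a)   # step A (e.g. a backend call)
    time.sleep(0.05)                     # step B, which runs after A
    return time.perf_counter() - t0

baseline = workload()
perturbed = workload(delay_in_step_a=0.1)
# If the end-to-end time grows by roughly the injected delay, step A lies on a
# critical path and is a candidate for optimisation.
print(f"baseline {baseline:.3f}s, with injected delay {perturbed:.3f}s")
```

The dissertation performs this kind of variation systematically inside a virtualised experimentation environment rather than by editing the application code as done here.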
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Analyzing, understanding and working with complex systems and large datasets has become a familiar challenge in the information era. The explosion of data worldwide affects nearly every part of society, particularly the science, engineering, health, and financial domains. Looking, for instance, at the automotive industry, engineers are confronted with the enormously increased complexity of vehicle electronics. Over the years, a large number of advanced functions, such as ACC (adaptive cruise control), rear seat entertainment systems or automatic start/stop engines, have been integrated into the vehicle. In the process, these functions have become more and more distributed across the vehicle, leading to the introduction of several communication networks. Keeping an overview of all relevant data facets, understanding dependencies, analyzing the flow of messages and tracking down problems in these networks has become a major challenge for automotive engineers. Promising approaches to overcome information overload and to provide insight into complex data are Information Visualization (InfoVis) and Visual Analytics (VA). Over the last decades, these research communities have spent much effort on developing new methods to help users obtain insight into complex data. However, few of these solutions have yet reached end users, and moving research into practice remains one of the great challenges in visual data analysis. This is particularly true for large company settings, where very little is known about additional challenges, obstacles and requirements in InfoVis/VA development and evaluation. Users have to be better integrated into our research processes in terms of adequate requirements analysis, understanding practices and challenges, developing well-directed, user-centered technologies and evaluating their value within a realistic context. This dissertation explores a novel InfoVis/VA application area, namely in-car communication networks, and demonstrates how information visualization methods and techniques can help engineers to work with and better understand these networks. Based on a three-year internship with a large automotive company and close cooperation with domain experts, I built a profound understanding of the specific challenges, requirements and obstacles for InfoVis/VA application in this area and learned that “designing with, not for, the people” is highly important for successful solutions. The three main contributions of this dissertation are: (1) an empirical analysis of current working practices of automotive engineers and the derivation of specific design requirements for InfoVis/VA tools; (2) the successful application and evaluation of nine prototypes, including the deployment of five systems; and (3) based on the three-year experience, a set of recommendations for developing and evaluating InfoVis systems in large company settings. I present ethnographic studies with more than 150 automotive engineers. These studies helped us to understand currently used tools, the underlying data, tasks and user groups, and to categorize the field into application sub-domains. Based on these findings, we propose implications and recommendations for designing tools to support current practices of automotive network engineers with InfoVis/VA technologies. I also present nine InfoVis design studies that we built and evaluated with automotive domain experts and use them to systematically explore the design space of applying InfoVis to in-car communication networks. Each prototype was developed in a user-centered, participatory process, in each case focusing on a specific sub-domain of target users with specific data and tasks. Experimental results from studies with real users are presented, showing that the visualization prototypes can improve the engineers’ work in terms of working efficiency, better understanding and novel insights. Based on lessons learned from repeatedly designing and evaluating our tools together with domain experts at a large automotive company, I discuss challenges and present recommendations for deploying and evaluating VA/InfoVis tools in large company settings. I hope that these recommendations can guide other InfoVis researchers and practitioners in similar projects by providing them with new insights, such as the necessity of close integration with existing tools and processes, the distributed knowledge and high degree of specialization in such settings, and the importance of addressing prevailing mental models and time restrictions. In general, I think that large company settings are a promising and fruitful field for novel InfoVis applications and expect our recommendations to be useful tools for other researchers and tool designers.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Steffen Jost researched a novel static program analysis that automatically infers formally guaranteed upper bounds on the use of compositional quantitative resources. The technique is based on manual amortised complexity analysis. Inference is achieved through a type system annotated with linear constraints. Any solution to the collected constraints yields the coefficients of a formula that expresses an upper bound on the resource consumption of a program in terms of the sizes of its various inputs. The main result is the formal soundness proof of the proposed analysis for a functional language. The strictly evaluated language features higher-order types, full mutual recursion, nested data types and suspension of evaluation, and can deal with aliased data. The presentation focuses on heap space bounds. Extensions allowing the inference of bounds on stack space usage and worst-case execution time are demonstrated for several realistic program examples. These bounds were inferred by the generic implementation of the technique created for this work. The implementation is highly efficient and solves even large examples within seconds.
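To illustrate the constraint-based flavour of such an analysis (any solution to the collected linear constraints yields the coefficients of a bound formula), the following Python sketch solves a made-up constraint system for a hypothetical list-copy function that allocates one heap cell per input element. The constraints and coefficients are invented for illustration; this is not the actual inference described above.

```python
# Hedged illustration: the analysis collects linear constraints over unknown
# coefficients; any feasible solution gives a concrete resource bound. The
# constraints below are made up for a hypothetical list-copy function.
from scipy.optimize import linprog

# Unknowns: c0 (constant potential), c1 (potential per list element).
# Illustrative constraints: c1 >= 1 (one cell per element), c0 >= 0.
result = linprog(c=[1, 1],                   # minimise c0 + c1 (tightest bound)
                 A_ub=[[0, -1]], b_ub=[-1],  # -c1 <= -1, i.e. c1 >= 1
                 bounds=[(0, None), (0, None)])
c0, c1 = result.x
print(f"inferred bound: heap usage <= {c0:.0f} + {c1:.0f} * |input list|")
```

The interesting property mirrored here is that the bound is read off directly from the constraint solution, without running the program.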
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
In recent years digital media have influenced many areas of our lives. The transition from analogue to digital has substantially changed our ways of dealing with media collections. Today’s interfaces for managing digital media mainly offer fixed linear models corresponding to the underlying technical concepts (folders, events, albums, etc.), or metaphors borrowed from the analogue counterparts (e.g., stacks, film rolls). However, people’s mental interpretations of their media collections often go beyond the scope of a linear scan. Beyond explicit search with specific goals, current interfaces cannot sufficiently support explorative and often non-linear behavior. This dissertation presents an exploration of interface design to enhance the browsing experience with media collections. The main outcome of this thesis is a new model of Exploratory Browsing to guide the design of interfaces that support the full range of browsing activities, especially Exploratory Browsing. We define Exploratory Browsing as the behavior of a user who is uncertain about her or his targets and needs to discover areas of interest (exploratory), within which she or he can explore in detail and possibly find acceptable items (browsing). According to the browsing objectives, we group browsing activities into three categories: Search Browsing, General Purpose Browsing and Serendipitous Browsing. In the context of this thesis, Exploratory Browsing refers to the latter two browsing activities, which go beyond explicit search with specific objectives. We systematically explore the design space of interfaces to support the Exploratory Browsing experience. Applying the methodology of User-Centered Design, we develop eight prototypes, covering two main usage contexts: browsing personal collections and browsing in online communities. The main media types studied are photographs and music. The main contribution of this thesis lies in deepening the understanding of how people’s exploratory behavior has an impact on interface design. This thesis contributes to the field of interface design for media collections in several respects. With the goal of informing the design of interfaces that support the Exploratory Browsing experience with media collections, we present a model of Exploratory Browsing, covering the full range of exploratory activities around media collections. We investigate this model in different usage contexts and develop eight prototypes. The substantial implications gathered during the development and evaluation of these prototypes inform the further refinement of our model: we uncover the underlying transitional relations between browsing activities and identify several stimulators that encourage fluid and effective transitions between activities. Based on this model, we propose a catalogue of general interface characteristics, and employ this catalogue as a set of criteria to analyze the effectiveness of our prototypes. We also present several general suggestions for designing interfaces for media collections.
Sozialwissenschaftliche Fakultät - Digitale Hochschulschriften der LMU
Googling three times a day: in Germany, more than eighty percent of internet users regularly search in search engines or web catalogues, regardless of age, gender and education. For large parts of the population, the internet has established itself as the number one research medium. Private blogs, forums, wikis and news portals: the credibility of information on the web varies considerably. Search engines reflect this heterogeneity. Andreas Tremel examines how users deal with search results of varying credibility. When users are asked about the importance of the credibility of information on the web, credibility appears to be the decisive selection criterion. Nevertheless, users seem naive, poorly informed and superficial in their handling of search results. To observe actual behaviour as a function of result credibility for the first time, a search engine simulation was developed that enabled an experimental manipulation within a field study (n=400). Against the background of credibility, the use of keyword advertising and the influence of research involvement were also examined.
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Mon, 26 Apr 2010 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/12649/ https://edoc.ub.uni-muenchen.de/12649/1/Koenig_Ralf.pdf König, Ralf ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Geschichts- und Kunstwissenschaften - Digitale Hochschulschriften der LMU
Wed, 28 Jan 2009 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/12436/ https://edoc.ub.uni-muenchen.de/12436/1/Leimbach_Timo.pdf Leimbach, Timo ddc:000, ddc:004, Fakultät für Geschichts- und Kunstwissenschaften
Fakultät für Mathematik, Informatik und Statistik - Digitale Hochschulschriften der LMU - Teil 01/02
Mon, 26 Jan 2009 12:00:00 +0100 https://edoc.ub.uni-muenchen.de/9612/ https://edoc.ub.uni-muenchen.de/9612/1/Gao_Huaien.pdf Gao, Huaien ddc:000, ddc:004, Fakultät für Mathematik, Informatik und Statistik
Fakultät für Kulturwissenschaften - Digitale Hochschulschriften der LMU
A critical edition of the Arabic version of Galen's De differentiis febrium, based on all extant manuscripts. The German translation is given en face. It is followed by an edition of the Alexandrian summaries of the treatise on fevers.