List of tables list of figures list of abbreviations






TITLE OF A MASTER’S THESIS ABOUT CIRCUITRY AND


GENERAL ELECTRICAL COMPONENTS OF THE


TYPE TI HAS TECH STUDENTS WORK ON


by


GURU DUTT, B.S.


A THESIS


IN


ELECTRICAL ENGINEERING


Submitted to the Graduate Faculty

of Texas Tech University in

Partial Fulfillment of

the Requirements for

the Degree of


MASTER OF SCIENCE


Approved


Richard Gale

Chairperson of the Committee


Michael Parten


Accepted


Fred Hartmeister

Dean of the Graduate School


May, 2009

ACKNOWLEDGMENTS


A journey is easier when you travel together. Interdependence is certainly more valuable than independence. This thesis is the result of one and a half years of work during which I have been accompanied and supported by many people. I am pleased to now have the opportunity to express my gratitude to all of them.

To start with, all glory be to the Lord Almighty, without whose continued support and help I would not be completing this thesis research.

TABLE OF CONTENTS




ABSTRACT


Researchers have traditionally chosen a 2:1 ratio for splitting a dataset into training and test datasets when constructing a classification model with neural network algorithms. Moreover, very little past research has involved large datasets (more than 1000 instances). This thesis studies the relationship between training data size and error rate and presents a detailed performance comparison of two algorithms: the multilayer perceptron algorithm and the voted perceptron algorithm. All of the datasets used in constructing the classification models are large datasets.

One of the conclusions of this research is that, even though a split ratio of 95:05 produces the maximum accuracy for large datasets with the two neural network algorithms, there is little difference between the error rate obtained at the 95:05 split ratio and the error rates obtained at other split ratios.

One of the algorithms in this thesis research, the voted perceptron algorithm, is known to classify linearly separable datasets. Because of the complexity present in real-world datasets, they are very rarely completely linearly separable. The authors of the algorithm used datasets in their experiments that are not linearly separable. In this thesis research, an attempt has been made to categorize datasets as linearly classifiable, not linearly classifiable, or partially linearly classifiable.

LIST OF TABLES




LIST OF FIGURES



LIST OF ABBREVIATIONS



MLP Multilayer Perceptron


VP Voted Perceptron


ARFF Attribute-Relation File Format


CSV Comma-Separated Values


CMC Contraceptive Method Choice

CHAPTER I

INTRODUCTION


Data stored in databases and data warehouses throughout the world is increasing at a high pace due to the availability of various modern data-capturing and data storage devices, but the data is seldom looked at even once afterwards. Therefore, mining such databases and large data warehouses for interesting knowledge would be helpful. As the databases are large and complex, a need exists for powerful tools for their analysis. The process of inferring such interesting knowledge from huge databases and data warehouses is called data mining. Data mining is a field in itself that has its roots in several other fields, such as machine learning, statistics, and databases. Data mining is the process of employing one or more computer learning techniques to analyze and extract knowledge from data contained within a database [RG03]. The components of data mining include classification and clustering, through the use of association rules, regression, and pattern discovery.

A few examples where classification finds use include credit card fraud detection, medical diagnosis, bank loan evaluation, and auto insurance risk evaluation. For example, a customer could be classified in the credit risk category as good or poor. One may classify diseases and provide the symptoms that describe each class. To classify the data stored in a database, the data is first divided into two sets, called the training dataset and the test dataset. The training dataset is analyzed to identify key characteristics, which are then used to construct a model, called a classifier, usually consisting of classification rules for the given dataset. This model is first tested for correctness on the test dataset and is then used to classify new data.
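
The split, train, and test workflow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in this thesis: the function names (split_dataset, train_perceptron, error_rate, make_toy_data), the synthetic two-feature data, and the use of a plain single-layer perceptron (rather than the multilayer or voted perceptron) are all hypothetical choices made for brevity.

```python
import random


def make_toy_data(n, seed=0):
    """Hypothetical data: 2-D points labelled +1 if x > y, else -1."""
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if abs(x - y) >= 0.2:  # keep a gap between the two classes
            data.append(((x, y), 1 if x > y else -1))
    return data


def split_dataset(instances, train_fraction, seed=0):
    """Shuffle the instances and split them into (training, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def score(weights, features):
    """Weighted sum of the features; weights[0] is the bias term."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def train_perceptron(training, epochs=20):
    """Plain perceptron rule: update the weights on every misclassification."""
    weights = [0.0] * (len(training[0][0]) + 1)
    for _ in range(epochs):
        for features, label in training:
            if label * score(weights, features) <= 0:
                weights[0] += label
                for i, value in enumerate(features):
                    weights[i + 1] += label * value
    return weights


def error_rate(weights, test):
    """Fraction of test instances whose predicted label is wrong."""
    wrong = sum(
        1 for features, label in test
        if (1 if score(weights, features) > 0 else -1) != label
    )
    return wrong / len(test)


if __name__ == "__main__":
    data = make_toy_data(300)
    # Compare the conventional 2:1 split with a 95:05 split.
    for fraction in (2 / 3, 0.95):
        training, test = split_dataset(data, fraction)
        model = train_perceptron(training)
        print(len(training), len(test), error_rate(model, test))
```

Running the script prints the training size, test size, and test error rate for each split ratio. With a small test set, such as the 5% held out in a 95:05 split, the error rate is estimated from few instances, so it should be read with that in mind.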