
Validation vs. test vs. training accuracy, which one to compare for claiming overfitting?


I have read in several answers here and elsewhere on the internet that cross-validation helps indicate whether a model will generalize well and whether it is overfitting.

But I am confused about which two accuracies/errors among training/validation/test I should compare to be able to tell whether the model is overfitting.

For example:

I split my data into 70% training and 30% test.

When I run 10-fold cross-validation, I get 10 accuracies that I can average. Should I call this mean the validation accuracy?

Afterward, I test the model on the 30% test data and get the test accuracy.

In this case, what is the training accuracy, and which two accuracies do I compare to see whether the model is overfitting?

This is my first question on this platform, so please forgive any errors.
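For concreteness, the workflow described above could be wired up roughly as follows. This is a minimal sketch assuming scikit-learn; the dataset and classifier are arbitrary placeholders, not something given in the question.

    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    # 70% training, 30% held-out test
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    clf = SVC(kernel='linear', C=1)
    cv_scores = cross_val_score(clf, X_train, y_train, cv=10)  # 10 accuracies
    print('mean 10-fold CV accuracy:', cv_scores.mean())

    clf.fit(X_train, y_train)
    print('accuracy on the 30% test set:', clf.score(X_test, y_test))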










Tags: machine-learning, cross-validation, accuracy, overfitting






          2 Answers
Answer 1 (by Esmailian, score 4)
> When I run 10-fold cross-validation, I get 10 accuracies that I can average. Should I call this mean the validation accuracy?

No. It is an [estimate of the] test accuracy.

The difference between validation and test sets (and their corresponding accuracies) is that the validation set is used to build or select a better model (e.g. to avoid over-fitting), meaning it affects the final model. However, since 10-fold CV always tests an already-built model, and it is not used here to select between models, its 10% held-out portion is a test set, not a validation set.




> Afterward, I test the model on the 30% test data and get the test accuracy.

If you don't use the K-fold CV to select between multiple models, this part is not needed; run K-fold CV on 100% of the data to get the test accuracy. Otherwise, you should keep a final test set, since the result of the K-fold CV would then be a validation accuracy.




> In this case, what will be the training accuracy?

From each of the 10 folds you can get a test accuracy on 10% of the data and a training accuracy on the other 90%. In Python, the method cross_val_score only returns the test accuracies. Here is how to get both:



    from sklearn import model_selection
    from sklearn import datasets
    from sklearn import svm

    iris = datasets.load_iris()
    clf = svm.SVC(kernel='linear', C=1)

    # cross_validate (unlike cross_val_score) can also return the per-fold
    # training scores when return_train_score=True
    scores = model_selection.cross_validate(clf, iris.data, iris.target,
                                            cv=5, return_train_score=True)
    print('Train scores:')
    print(scores['train_score'])
    print('Test scores:')
    print(scores['test_score'])



> And which two accuracies do I compare to see whether the model is overfitting or not?

You should compare the training and test accuracies to identify over-fitting. A training accuracy that is subjectively far higher than the test accuracy indicates over-fitting.
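Continuing the snippet above, one way to make that comparison is to average the per-fold scores and look at the gap. This is only an illustration of the comparison described here; there is no fixed threshold for "far higher".

    train_mean = scores['train_score'].mean()
    test_mean = scores['test_score'].mean()
    print('mean training accuracy: %.3f' % train_mean)
    print('mean test accuracy:     %.3f' % test_mean)
    # a training mean well above the test mean is the over-fitting signal
    print('gap: %.3f' % (train_mean - test_mean))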



I suggest the "Bias and Variance" and "Learning curves" parts of "Machine Learning Yearning" by Andrew Ng. It presents plots and interpretations for all of these cases with a clear narration.



More on the validation set

A validation set shows up in two general cases: (1) building a model, and (2) selecting between multiple models.

1. Two examples of building a model: we (a) stop training a neural network, or (b) stop pruning a decision tree, when the accuracy of the model on the validation set starts to decrease. Then we test the final model on a held-out set to get the test accuracy.

2. An example of selecting between multiple models: we run K-fold CV on an SVM and a decision tree, then select the one with the higher validation accuracy. Finally, we test the selected model on a held-out set to get the test accuracy (see the sketch below).
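A minimal sketch of case 2, assuming scikit-learn; the dataset, candidate models, split size, and fold count are illustrative choices rather than part of the original answer.

    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    # keep a final held-out test set that is never used for selection
    X_dev, X_test, y_dev, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    candidates = {'svm': SVC(kernel='linear', C=1),
                  'tree': DecisionTreeClassifier(random_state=0)}

    # validation accuracy: mean K-fold score on the development data
    cv_means = {name: cross_val_score(model, X_dev, y_dev, cv=10).mean()
                for name, model in candidates.items()}
    best_name = max(cv_means, key=cv_means.get)

    # test accuracy: refit the winner on all development data,
    # then score it once on the untouched held-out set
    best = candidates[best_name].fit(X_dev, y_dev)
    print(best_name, 'held-out test accuracy:', best.score(X_test, y_test))

Note that the held-out test set is touched exactly once, after the selection is made, which is the point raised in the comments below.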






Comments:

• I think I disagree with "30% test set not needed." If you are using CV to select a better model, then you are exposing the test folds (which I would call a validation set in this case) and risk overfitting there. The final test set should remain untouched (by both you and your algorithms) until the end, to estimate the final model performance (if that's something you need). But yes, while model-building, the (averaged) training fold score vs. the (averaged) validation fold score is what you're looking at for an indication of overfitting. – Ben Reiniger

• @BenReiniger You are right, I should clarify this case. – Esmailian

• @Esmailian Is train_score also an average of 10 scores? Also, to do a similar kind of thing with GridSearchCV (in case hyperparameter tuning and cross-validation are required in one step), can we use return_train_score=True? Is it the same? – A.B

• @A.B It is an array; it needs to be averaged. return_train_score=True or =False only changes the returned report; the underlying result is the same. – Esmailian

• Okay, thanks. I am accepting the answer, as "which accuracy is to be used" makes sense. But is it possible for you to elaborate more on "the validation set is used to build/select a better model (e.g. avoid over-fitting)" vs. "in your case, 10-fold CV tests an already-built model" for me and future readers? – A.B
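Regarding the GridSearchCV question in the comments: the sketch below shows one way to read averaged training and cross-validation scores out of a grid search via return_train_score=True. The estimator and parameter grid are placeholders, not something specified in the thread.

    from sklearn.model_selection import GridSearchCV
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    grid = GridSearchCV(SVC(kernel='linear'),
                        param_grid={'C': [0.1, 1, 10]},
                        cv=10,
                        return_train_score=True)  # also record per-fold training scores
    grid.fit(X, y)

    # cv_results_ holds the scores already averaged over the folds
    for c, tr, va in zip(grid.cv_results_['param_C'],
                         grid.cv_results_['mean_train_score'],
                         grid.cv_results_['mean_test_score']):
        print('C=%s  mean train=%.3f  mean CV=%.3f' % (c, tr, va))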


















Answer 2 (by astel, score 3)

Cross-validation splits your data into K folds. Each fold contains a set of training data and test data. You are correct that you get K different error rates that you then take the mean of. These error rates come from the test set of each of your K folds. If you want the training error rate, you would calculate the error rate on the training part of each of these K folds and then take the average.
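A minimal sketch of that per-fold bookkeeping with an explicit KFold loop, assuming scikit-learn; the dataset and classifier are arbitrary placeholders.

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    kf = KFold(n_splits=10, shuffle=True, random_state=0)

    train_acc, test_acc = [], []
    for train_idx, test_idx in kf.split(X):
        clf = DecisionTreeClassifier(random_state=0)
        clf.fit(X[train_idx], y[train_idx])
        # accuracy on the 90% used for fitting and on the 10% held out
        train_acc.append(clf.score(X[train_idx], y[train_idx]))
        test_acc.append(clf.score(X[test_idx], y[test_idx]))

    print('mean training accuracy:', np.mean(train_acc))
    print('mean test accuracy:', np.mean(test_acc))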






Comments:

• Thank you for the answer. – A.B









